Socorro/ElasticSearch API: Difference between revisions

= Middleware API for ElasticSearch =
This page is now in [[Socorro/Middleware]].
 
'''This is a draft''' of the new API for querying ElasticSearch through the middleware API of Socorro.
 
The Middleware API aims to separate the front-end from the back-end by providing a single interface to the data. The front-end then does not need to know which storage system is in use: it retrieves data from HBase, PostgreSQL or ElasticSearch in a consistent and simple way, through our REST API.
 
The API is divided into several categories (entry points):
* /query
* /search
* /report
* /crash
* /stats
 
These categories are explained below.
 
== Version ==
 
Every URI is prefixed by a version number, so final URIs should look like: http://example.com/(api_version)/(request)/.
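
As a minimal sketch of this prefixing rule, the helper below joins a base URL, an API version and a request path into one URI. The function name <tt>versioned_uri</tt> is hypothetical and not part of Socorro; the version value <tt>110505</tt> is borrowed from the examples on this page.

```python
def versioned_uri(base_url: str, api_version: str, request: str) -> str:
    """Join base URL, API version and request path, ending with a slash."""
    # Strip stray slashes so the parts can be joined with exactly one "/".
    parts = [base_url.rstrip("/"), api_version.strip("/"), request.strip("/")]
    return "/".join(parts) + "/"


print(versioned_uri("http://example.com", "110505", "query/crashes"))
```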
 
== Query ==
 
=== Description ===
 
Low-level query: sends a JSON query directly to ES and returns the result of that query.
 
=== API Spec ===
 
HTTP request: '''POST'''<br>
Data: JSON query to send to ElasticSearch<br>
URI: '''/query/[(''types'')/]'''
 
* ''types'': Types of data we are looking into. If omitted, default value is _all. Several types can be specified, separated by a + symbol.
 
=== Return value ===
 
This request returns the exact data the storage system returned.
 
=== Example ===
 
<pre>curl -XPOST 'http://example.com/110505/query/crashes/' -d '{
    "query" : {
        "match_all" : {}
    }
}'</pre>
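
The same request could be prepared from Python roughly as follows. This is only a sketch: <tt>build_query_request</tt> is a hypothetical helper, the request is built but never sent, and the spec above does not say which content type the middleware expects, so none is set here.

```python
import json
import urllib.request


def build_query_request(base_url, api_version, es_query, types=None):
    """Prepare (but do not send) a POST request for the /query/ entry point."""
    path = "query/"
    if types:
        # Several types are separated by "+". When types are omitted,
        # the middleware is expected to fall back to _all.
        path += "+".join(types) + "/"
    uri = f"{base_url.rstrip('/')}/{api_version}/{path}"
    body = json.dumps(es_query).encode("utf-8")
    return urllib.request.Request(uri, data=body, method="POST")


req = build_query_request(
    "http://example.com", "110505", {"query": {"match_all": {}}}, ["crashes"]
)
print(req.get_method(), req.full_url)
```

Passing the returned object to <tt>urllib.request.urlopen</tt> would actually send it.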
 
== Search ==
 
=== Description ===
 
Searches for crashes and returns them. This search is highly configurable, but it can also be kept very simple by relying on default values.
 
=== API Spec ===
 
HTTP request: '''GET'''<br>
URI: '''/search/(''types'')/for/(''terms'')/product/(''product'')/from/(''from_date'')/to/(''to_date'')/(''optional_parameters'')'''
 
* <tt>types</tt>: Types of data we are looking into. Can be set to <tt>_all</tt> to search all types. Several types can be specified, separated by a + symbol.
* <tt>terms</tt>: Terms we are searching for. Each term must be URL encoded. Several terms can be specified, separated by a + symbol.
* <tt>product</tt>: The product we are interested in. (e.g. Firefox, Fennec, Thunderbird... )
* <tt>from_date</tt>: Search for crashes that happened after this date.
* <tt>to_date</tt>: Search for crashes that happened before this date.
 
<u>'''Order:'''</u>
 
Parameters, except for the first one (/search/(''types'')/), can be given in any order. Optional and mandatory parameters can be mixed. (See the example below.)
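
One way to read this rule: everything after /search/(''types'')/ is a sequence of keyword/value pairs, so a parser can collect them into a dictionary and ignore their order. The sketch below illustrates that idea only. <tt>parse_search_path</tt> is a hypothetical function, not the middleware's actual parser, and it assumes the version prefix has already been removed from the path.

```python
def parse_search_path(path):
    """Split a /search/ path into its types and keyword/value parameters."""
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] != "search":
        raise ValueError("expected a path starting with /search/(types)/")
    rest = segments[2:]
    if len(rest) % 2:
        raise ValueError("parameters must come in keyword/value pairs")
    # The types are positional; every other parameter is named by its keyword.
    params = {"types": segments[1].split("+")}
    for keyword, value in zip(rest[0::2], rest[1::2]):
        params[keyword] = value
    return params


first = parse_search_path(
    "/search/crashes/for/libflash.so/product/firefox/from/2011-05-01/to/2011-05-05/"
)
second = parse_search_path(
    "/search/crashes/product/firefox/to/2011-05-05/for/libflash.so/from/2011-05-01/"
)
print(first == second)
```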
 
<u>'''Optional parameters:'''</u>
 
This request has some optional parameters that can be omitted. Any omitted parameter has a default value or is not used while querying ES. You can use only some of those parameters or all of them.
 
The complete URI is as follows:
/search/(''types'')/for/(''terms'')/product/(''product'')/from/(''from_date'')/to/(''to_date'')/'''in/(''fields'')/version/(''version'')/os/(''os_name'')/branches/(''branches'')/search_mode/(''search_mode'')/reason/(''crash_reason'')/build/(''build_id'')/report_process/(''report_process'')/report_type/(''report_type'')/plugin_in/(''plugin_in'')/plugin_search_mode/(''plugin_search_mode'')/plugin_term/(''plugin_term'')'''
 
* <tt>fields</tt>: Fields we are searching in. Several fields can be specified, separated by a + symbol. By default, all fields are searched.
* <tt>version</tt>: Version of the product. Can be set to <tt>_all</tt> to search all versions. By default, all versions are searched.
* <tt>os_name</tt>: Name of the operating system. (e.g. Windows, Mac, Linux...) By default, all operating systems are searched.
* <tt>branches</tt>: Branches to restrict the search to. Several branches can be specified, separated by a + symbol. By default, all branches are searched.
* <tt>search_mode</tt>: Sets how to search. Can be either <tt>is_exactly</tt>, <tt>contains</tt> or <tt>start_with</tt>. Default value is <tt>contains</tt>.
* <tt>crash_reason</tt>: Restricts search to crashes caused by this reason. Default value is empty.
* <tt>build_id</tt>: Restricts search to crashes that happened on a product with this build ID. Default value is empty.
* <tt>report_process</tt>: Can be <tt>any</tt>, <tt>browser</tt> or <tt>plugin</tt>. Default value is any.
* <tt>report_type</tt>: Can be <tt>any</tt>, <tt>crash</tt> or <tt>hang</tt>. Default value is any.
* <tt>plugin_in</tt>: Search for a plugin in this field. <tt>report_process</tt> has to be set to <tt>plugin</tt>. Default value is empty.
* <tt>plugin_search_mode</tt>: How to search for this plugin. <tt>report_process</tt> has to be set to <tt>plugin</tt>. Default value is empty.
* <tt>plugin_term</tt>: Terms to search for. Several terms can be specified, separated by a + symbol. <tt>report_process</tt> has to be set to <tt>plugin</tt>. Default value is empty.
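
The mandatory and optional parameters above can be assembled into a URI along these lines. This is a sketch under assumptions: <tt>build_search_uri</tt> and <tt>join_values</tt> are hypothetical names, the keyword list is copied from the complete URI above, and every value is URL encoded even though the spec only requires it for terms.

```python
from urllib.parse import quote

# URI keyword of each optional parameter, in the order of the complete URI.
OPTIONAL_KEYWORDS = (
    "in", "version", "os", "branches", "search_mode", "reason", "build",
    "report_process", "report_type", "plugin_in", "plugin_search_mode",
    "plugin_term",
)


def join_values(values):
    """URL-encode each value and join several values with a + symbol."""
    if isinstance(values, str):
        values = [values]
    return "+".join(quote(str(value), safe="") for value in values)


def build_search_uri(types, terms, product, from_date, to_date, optional=None):
    """Build the /search/ path; omitted optional parameters are left out."""
    optional = dict(optional or {})
    unknown = set(optional) - set(OPTIONAL_KEYWORDS)
    if unknown:
        raise ValueError(f"unknown optional parameters: {sorted(unknown)}")
    segments = ["search", join_values(types)]
    mandatory = [("for", terms), ("product", product),
                 ("from", from_date), ("to", to_date)]
    for keyword, value in mandatory:
        segments += [keyword, join_values(value)]
    for keyword in OPTIONAL_KEYWORDS:
        if keyword in optional:
            segments += [keyword, join_values(optional[keyword])]
    return "/" + "/".join(segments) + "/"


print(build_search_uri(
    "crashes", "libflash.so", "firefox", "2011-05-01", "2011-05-05",
    optional={"in": "signature", "version": "4.0.1", "os": "Windows"},
))
```

The optional parameters are passed as a dictionary rather than as keyword arguments because <tt>in</tt> is a reserved word in Python.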
 
=== Return value ===
 
Returns the full JSON documents that match the search parameters. ''JSON documents schema to be determined.''
 
=== Example ===
 
<pre>http://example.com/110505/search/crashes/for/libflash.so/in/signature/product/firefox/version/4.0.1/from/2011-05-01/to/2011-05-05/os/Windows/</pre>
 
== Report ==
 
=== Description ===
 
Get a specific report.
 
=== API Spec ===
 
HTTP request: '''GET'''<br>
URI: '''/report/(''report_name'')/(''report_parameters'')'''
 
==== <u>Top Changers by Signature</u> ====
 
URI: '''/top_changers_by_signature/product/(''product'')/version/(''version'')/from/(''from_date'')/to/(''to_date'')/'''
 
* ''product'': The product we are interested in. (e.g. Firefox, Fennec, Thunderbird... )
* ''version'': Version of the product.
* ''from_date'': Only crashes that happened after this date.
* ''to_date'': Only crashes that happened before this date.
 
==== <u>Top Crashers by Signature</u> ====
 
URI: '''/top_crashers_by_signature/product/(''product'')/version/(''version'')/from/(''from_date'')/to/(''to_date'')/'''
 
* ''product'': The product we are interested in. (e.g. Firefox, Fennec, Thunderbird... )
* ''version'': Version of the product.
* ''from_date'': Only crashes that happened after this date.
* ''to_date'': Only crashes that happened before this date.
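
The two by-signature reports share the same parameters, so one helper can build either URI. The sketch below assumes that the report name is nested under /report/, following the general form /report/(''report_name'')/(''report_parameters'') given in the API Spec; <tt>build_report_uri</tt> is a hypothetical name.

```python
from urllib.parse import quote


def build_report_uri(report_name, product, version, from_date, to_date):
    """Build a by-signature report path from its four parameters."""
    # Assumption: report names live under the /report/ entry point.
    segments = ["report", report_name]
    params = [("product", product), ("version", version),
              ("from", from_date), ("to", to_date)]
    for keyword, value in params:
        segments += [keyword, quote(value, safe="")]
    return "/" + "/".join(segments) + "/"


print(build_report_uri(
    "top_crashers_by_signature", "firefox", "4.0.1", "2011-05-01", "2011-05-05"
))
```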
 
==== <u>Top Crashers by URL</u> ====
 
URI: '''/top_crashers_by_url/...'''
 
==== <u>Top Crashers by Domain</u> ====
 
URI: '''/top_crashers_by_domain/...'''
 
==== <u>Top Crashers by Topsite</u> ====
 
URI: '''/top_crashers_by_topsite/...'''
 
== Crash ==
 
=== Description ===
 
Searches for a crash by its OOID and returns it. This query is already implemented in the Middleware.
 
=== API Spec ===
 
See http://code.google.com/p/socorro/wiki/APICalls
 
== Stats ==
 
=== Description ===
 
'''This is a proposition.'''
 
Gets statistics about the data, e.g. counts by OS, by product, by ADU, by build... Unlike report, stats only sends back numeric data, counted over the entire data set or over a given date range.