... | ... | @@ -61,14 +61,14 @@ Entry inside ___solrconfig.xml___: |
|
|
</requestHandler>
|
|
|
```
|
|
|
|
|
|
Parameters:
|
|
|
_Parameters:_
|
|
|
|
|
|
* rows: the number N of top-ranked result documents that are picked up for clustering
|
|
|
* relevantTermsSummarizer=[true|false]: enables cluster summarization
|
|
|
* maxIntTerms: how many label terms should be extracted per cluster
|
|
|
* maxDocsPerCluster: limits the number of documents per cluster that are considered for cluster summarization. Use it to tune performance. Default: 100
|
|
|
|
|
|
Examples:
|
|
|
_Examples:_
|
|
|
```
|
|
|
http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/clusteringSTC?q=%2BdynaqCategory:brandwatch+%2Btitle%3Ascreen+%2Btitle%3Anews+%2Bmodified%3A[20130301000000000+TO+20140201000000000]&rows=100&fl=&relevantTermsSummarizer=true&maxIntTerms=42&maxDocsPerCluster=50
|
|
|
```
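The request above can also be assembled programmatically. The following is a minimal sketch, assuming the endpoint URL from the example; `build_clustering_url` is a hypothetical helper name, not part of the module.

```python
from urllib.parse import urlencode

# Endpoint taken from the example above; adjust host and collection as needed.
BASE_URL = "http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/clusteringSTC"


def build_clustering_url(query, rows=100, summarize=True,
                         max_int_terms=42, max_docs_per_cluster=100):
    """Return a clusteringSTC request URL (hypothetical helper)."""
    params = {
        "q": query,
        "rows": rows,
        # Solr expects lowercase true/false, not Python's True/False.
        "relevantTermsSummarizer": "true" if summarize else "false",
        "maxIntTerms": max_int_terms,
        "maxDocsPerCluster": max_docs_per_cluster,
    }
    # urlencode percent-encodes '+' and ':' and turns spaces into '+'.
    return BASE_URL + "?" + urlencode(params)


url = build_clustering_url("+dynaqCategory:brandwatch +title:screen",
                           max_docs_per_cluster=50)
print(url)
```

Letting `urlencode` do the escaping avoids hand-encoding the `+` and `:` characters of the query syntax.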
|
... | ... | @@ -90,14 +90,14 @@ ___solrconfig.xml___ entry: |
|
|
</requestHandler>
|
|
|
```
|
|
|
|
|
|
Parameters:
|
|
|
_Parameters:_
|
|
|
|
|
|
* contextDocId: the id of a context document. Can be specified several times
|
|
|
* contextDocsField: the attribute that should be considered for the similarity search. Usually the full body text of a document
|
|
|
* contextDocsBoost=[number]: boosts the context docs by multiplying their scores with this factor
|
|
|
* includeContextResults=[true|false]: set it to true if the final result should also include all documents that are similar to the context docs but don't match the query
|
|
|
|
|
|
Examples:
|
|
|
_Examples:_
|
|
|
```
|
|
|
http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/dynaq?q=%2B%28dynaqCategory:brandwatch%29&contextDocIds=http://www.usatoday.com/story/news/nation/2013/02/14/drought-farmers-midwest/1920577/&contextDocsField=body&rows=10&fl=dataEntityId,title,creator,score
|
|
|
```
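Because the context id may be given several times, a list of key/value pairs is a convenient way to build the request. This is a sketch under assumptions: `build_dynaq_url` is a hypothetical helper, the second document id is made up for illustration, and the parameter name is left configurable because the parameter list says `contextDocId` while the example URL uses `contextDocIds`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the example above.
BASE_URL = "http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/dynaq"


def build_dynaq_url(query, context_doc_ids, context_field="body", rows=10,
                    fl="dataEntityId,title,creator,score",
                    id_param="contextDocId"):
    """Return a dynaq request URL with one id parameter per context document."""
    params = [("q", query)]
    # Repeating the same key is how a multi-valued parameter is sent.
    params += [(id_param, doc_id) for doc_id in context_doc_ids]
    params += [("contextDocsField", context_field), ("rows", rows), ("fl", fl)]
    return BASE_URL + "?" + urlencode(params)


context_ids = [
    "http://www.usatoday.com/story/news/nation/2013/02/14/drought-farmers-midwest/1920577/",
    "http://example.org/some-other-document",  # hypothetical id
]
url = build_dynaq_url("+(dynaqCategory:brandwatch)", context_ids)
print(url)
```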
|
... | ... | @@ -119,15 +119,6 @@ EmptyFlResponseCleaner is configured as last-component: |
|
|
</requestHandler>
|
|
|
```
|
|
|
|
|
|
Parameters:
|
|
|
|
|
|
*
|
|
|
|
|
|
Examples:
|
|
|
```
|
|
|
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
### Document group summarizer ###
|
... | ... | @@ -157,13 +148,15 @@ Enable the module in your ___solrconfig.xml___: |
|
|
</requestHandler>
|
|
|
```
|
|
|
|
|
|
Parameters:
|
|
|
_Parameters:_
|
|
|
|
|
|
*
|
|
|
* maxIntTerms: the maximum number of potentially interesting terms the system should extract. Default: 42
|
|
|
* rows: specify the length of the result list, thus the number of top documents that should be considered for extraction
|
|
|
* further, all parameters from the Solr MoreLikeThis component can be specified
|
|
|
|
|
|
Examples:
|
|
|
_Examples:_
|
|
|
```
|
|
|
|
|
|
http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/docgroups/relevantTerms?q=title:shark&maxIntTerms=42&rows=50
|
|
|
```
|
|
|
|
|
|
---
|
... | ... | @@ -201,15 +194,6 @@ Further, you have to configure it to each SearchHandler for which you want to re |
|
|
</requestHandler>
|
|
|
```
|
|
|
|
|
|
Parameters:
|
|
|
|
|
|
*
|
|
|
|
|
|
Examples:
|
|
|
```
|
|
|
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
### Trend Analysis ###
|
... | ... | @@ -225,7 +209,7 @@ Data for each single time range segment: |
|
|
* Momentum (second derivative) in this segment
|
|
|
|
|
|
Aggregated data for the whole time series analysis:
|
|
|
* Overall Amplitude of the trend: Number of results, amount of results (percentage), sum of relevancies, average relevancies
|
|
|
* Overall amplitude of the trend: Number of results, amount of results (percentage), sum of relevancies, average relevancies
|
|
|
* Slope average, for both result count and relevancies
|
|
|
* Momentum average
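To illustrate how slope and momentum relate, here is a minimal sketch that treats them as first and second finite differences of the per-segment result counts. That interpretation is an assumption; the module's exact formulas are not given here, and the sample counts are made up.

```python
def differences(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical result counts for five consecutive time segments.
counts = [3, 5, 9, 15, 18]

slopes = differences(counts)    # first differences: change per segment
momenta = differences(slopes)   # second differences: change of the slope

slope_average = sum(slopes) / len(slopes)
momentum_average = sum(momenta) / len(momenta)

print(slopes, momenta, slope_average, momentum_average)
```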
|
|
|
|
... | ... | @@ -253,12 +237,24 @@ Entry in ___solrconfig.xml___: |
|
|
</requestHandler>
|
|
|
```
|
|
|
|
|
|
Parameters:
|
|
|
_Parameters:_
|
|
|
Trend analysis:
|
|
|
|
|
|
*
|
|
|
* range: the time range inside the corpus that should be analyzed. You can use standard Solr date syntax or DynaQ date syntax, depending on which one the index uses for this field. DynaQ syntax is [-]yyyyMMddhhmmssSSS, where y can be repeated as often as needed.
|
|
|
* granularity=<number><time unit>: the overall time range is split into segments for calculation, and you get analysis results for each segment. This parameter specifies the length of the time segments. Possible values for the time unit: w(eeks), d(ays), h(ours), m(inutes), s(econds), [ms or S] (milliseconds), p(ercentage of the whole range), M(onths), e.g. 5w => five weeks
|
|
|
* slicedata=[true|false]: if true, the data for all slices/time segments will be returned. If false, only the overall information will be added to the response. Default is true, so you can plot a time series graph.
|
|
|
* relevanciesAndDocs=[true|false]: by default, the document ids and the relevancy data are not considered for calculation, for performance reasons. If you want them anyway, specify relevanciesAndDocs=true
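
A client may want to check a granularity value before sending it. The sketch below is a client-side validator only, not the module's own parser; it assumes the number is a non-negative integer, and `parse_granularity` is a hypothetical helper name.

```python
import re

# Unit suffixes as listed above; 'ms' and 'S' both mean milliseconds.
UNITS = {
    "w": "weeks", "d": "days", "h": "hours", "m": "minutes",
    "s": "seconds", "ms": "milliseconds", "S": "milliseconds",
    "p": "percent of whole range", "M": "months",
}

# 'ms' is tried before the single-letter units so that '5ms' is not read as '5m'.
_GRANULARITY = re.compile(r"^(\d+)(ms|[wdhmsSpM])$")


def parse_granularity(value):
    """Split a granularity string such as '5w' into (number, unit name)."""
    match = _GRANULARITY.match(value)
    if match is None:
        raise ValueError(f"not a valid granularity: {value!r}")
    number, unit = match.groups()
    return int(number), UNITS[unit]


print(parse_granularity("5w"))
print(parse_granularity("1M"))
```

Note that the units are case-sensitive: `m` is minutes while `M` is months.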
|
|
|
|
|
|
Forecasting:
|
|
|
|
|
|
Examples:
|
|
|
* predictionLength=<number>: enables forecasting. The response gets additional slices/time segments for the future time after the specified analysis time range. This parameter specifies the number of additional time segments the system extrapolates into the future
|
|
|
* timeSeries=<time series as number array>: instead of triggering a trend analysis and then extrapolating the resulting time series into the future, you can also specify a time series directly with this parameter, skipping the trend analysis step
|
|
|
|
|
|
_Examples:_
|
|
|
```
|
|
|
|
|
|
http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/trends?q=title:screen&range=modified:[20130301000000000%20TO%2020140201000000000]&granularity=1M&fl=dataEntityId,score,title
|
|
|
http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/trends?q=title:screen&range=modified:[20130301000000000%20TO%2020140201000000000]&granularity=1M&fl=&predictionLength=10
|
|
|
http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/trends?timeSeries=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]&predictionLength=5
|
|
|
```
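The DynaQ timestamps in these URLs can be generated from ordinary datetime values. This is a minimal sketch, assuming a 24-hour clock and four-digit years as in the examples; `to_dynaq` and `build_trends_url` are hypothetical helper names.

```python
from datetime import datetime
from urllib.parse import urlencode

# Endpoint taken from the examples above.
BASE_URL = "http://earlytrendradarservice.kl.dfki.de/solr/etrCollection/trends"


def to_dynaq(dt):
    """Format a datetime as yyyyMMddhhmmssSSS (milliseconds as last 3 digits)."""
    return dt.strftime("%Y%m%d%H%M%S") + f"{dt.microsecond // 1000:03d}"


def build_trends_url(query, field, start, end, granularity,
                     prediction_length=None):
    """Return a trends request URL; forecasting is enabled only if requested."""
    params = {
        "q": query,
        "range": f"{field}:[{to_dynaq(start)} TO {to_dynaq(end)}]",
        "granularity": granularity,
    }
    if prediction_length is not None:
        params["predictionLength"] = prediction_length
    return BASE_URL + "?" + urlencode(params)


url = build_trends_url("title:screen", "modified",
                       datetime(2013, 3, 1), datetime(2014, 2, 1),
                       "1M", prediction_length=10)
print(url)
```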
|
|
|
|