
Starting from how to improve the Elasticsearch search experience

1、 A real-world problem

A reader asked: how can search return the best possible results?

I have a search feature here. It is implemented with the ik analyzer together with a multi query.

Along the way we also added a dictionary of terms from the customer's domain.

But the customer keeps reporting that the search experience is poor.

What else can we do to improve the search experience?

Source: the Elasticsearch Knowledge Planet community

This is a very representative problem, and one I have also run into in real product development.

2、 A few examples of search experience

Example 1: a screenshot of searching for "trigger" on "Longed for X network".

Note: I entered "trigger". The first result is fine, but the rest are about "touch" and "hair" (the two individual characters of the Chinese word for "trigger") and have nothing to do with my search.

From a user-experience perspective, I think the experience is very poor: a lot of irrelevant data is returned.

Example 2: a question-bank app that does not support jumping to a given question.

The question bank has 1703 questions, including true/false questions and multiple-choice questions.

The only navigation supported is clicking "Previous question" and "Next question".

In practice:

  • By the time you reach question 100 or 200, it is still all multiple-choice questions; how many multiple-choice questions are there in total?

  • After you quit, you have to click hundreds of times to get back to the last question you worked on...

This is not a poor user experience; it is no user experience at all. The developers did not design it, and users are left "doubting life".

Example 3: searching e-commerce sites for "The first autumn pants". What should be returned?

Zoom in on the screenshots and the highlights appear.

What to return is a matter of opinion; each e-commerce company has its own judgment.

But from the user's point of view, the difference in quality is immediately clear.

Mingyi's comments:

  • Pinduoduo

"No wonder you are growing so fast." The results are exactly what was expected, and it also helpfully recommends "long johns" listings by location.

  • Taobao

Conventional: at least it returns "long johns".

  • JD.COM

No products found, so it recommends "long johns" for you. "Why call it a recommendation? Just return the results directly."

  • Dangdang

Good heavens! What it recommends are "autumn" products. If you were the user, what would you think?

  • "Damn, what is this?"

  • "Mixed feelings"

  • "Confused"

......

We can roughly conclude that a company's growth rate tracks the quality of its search experience.

3、 Where there is data, there is search

Today, with information overflowing and exploding, search is everywhere. It can be summed up as: "Where there is data, there is search."

Search is probably one of the features users use most. Study, work, clothing, food, housing, and travel are all inseparable from search.

  • Study

Enter keywords to search for reliable free or paid online resources.

  • Work

When you hit an error code, search Google for the answer.

Search WeChat chat history to find a key piece of information from an earlier conversation.

  • Clothing

Buying clothes online is really a process of searching and selecting.

  • Food

Ordering takeout at noon each day: choosing what to order is a search process. Close to the office + highly rated = a high chance of placing an order.

  • Housing

Booking a hotel for a business trip: search, compare, and choose a cost-effective one.

  • Travel

For a self-driving trip over the National Day holiday (October 1), open Gaode navigation before leaving, enter the destination, and choose a suitable route based on the returned results.

As one analysis of search experience puts it: "The design and usability of the search box is a key point that cannot be ignored.

A good search experience may not make users feel especially good about your product, but a bad search experience can be fatal to it.

So whether to serve users better or to avoid a negative experience, a good search experience is essential for a content-based product."

When judging whether a search experience is good, results that meet the user's need are the minimum bar. The following are points users care about that make for a good search experience:

  • Search box:

1) Make the search box visually prominent, and combine it with a magnifying-glass icon;

2) Put the search box where users expect it to be;

3) Provide a search button;

4) Make it the right size.

To put it bluntly: "Placing the search box in the most prominent position of the navigation bar is the minimum respect owed to users!"

  • Searchable-content hints: tell users what they can search for

  • Every page should have a search box

  • Use a smart recommendation / matching mechanism

Smart recommendation or matching reduces the user's input cost.

The average user is not very good at formulating search queries. If they do not express the problem clearly in the first step, it is hard to find the right results afterwards.

When smart matching works, it helps users express their question clearly and find a satisfactory answer.

In short, a good search experience is a good user experience, and a good user experience is naturally linked to user retention and even to the company's growth.

4、 Breaking down the five core stages of a user search

"Search is like a conversation between the user and an app or website: users express their information needs by asking questions, and the app or site responds by presenting results.

Users expect a smooth search experience, and they usually form a quick judgment of an app's value based on the quality of its search results."

From the user's perspective, the search process can be roughly divided into five stages: discovering search, entering keywords, waiting for results, viewing results, and completing the search. The experience at each stage is part of the whole and affects the user's final impression of search.

4.1 Discover search

As mentioned earlier, the search box should be eye-catching. The search bar may even be separated from the header and placed at the visual focus of the UI, so that users can find it easily.

4.2 Enter keywords

  • Prompt users with what keywords they can enter.

  • Offer "search suggestions" based on the first few characters the user types, as Google does.

  • For complex combined searches, similar to Google's [Advanced search], provide auxiliary controls: date filters, excluded keywords, sort order, AND/OR/NOT expressions, and so on.

4.3 Wait for the result

  • Respond quickly. Users' patience is limited; if nothing comes back within 3 seconds, users are likely to leave.

  • If it really is slow, show a loading animation or a friendly progress message.

  • Recognize the user's input and, where needed, combine it with the user's search history and habits to return the best top-N results.

4.4 View results

  • The user screens the returned results.

  • If there are no results, it is better not to return a bare "0 results". Offer other recommendations, for example prompting the user to change keywords.

4.5 Complete search

  • If a result meets the need, the search ends.

  • If the results do not satisfy the user, they will change keywords and keep searching, or leave for another app or website.

To improve the user experience, none of these stages can be skipped, and each takes real effort.

5、 The underlying logic of Elasticsearch search

Understand the following two processes and you understand Elasticsearch search.

The following applies only to full-text search on the text field type.

5.1 The write (indexing) process

  • A document is not written to Elasticsearch as-is. It is first tokenized by the analyzer defined in your mapping (default: standard), and it is written once the inverted index has been built.

  • The choice of analyzer determines the granularity of the tokens, and that granularity determines whether later searches can meet the requirement.
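The write path described above can be pictured with a toy sketch in Python. This is only an illustration under simplifying assumptions: `analyze` is a hypothetical stand-in for a real analyzer (it lowercases and splits on non-alphanumeric characters), and the index is an in-memory dictionary, not how Elasticsearch or Lucene store data.

```python
import re
from collections import defaultdict

def analyze(text):
    """Toy stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]}, the core of an inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index

# Hypothetical sample documents.
docs = {
    1: "Elasticsearch search experience",
    2: "Improve the search box",
}
index = build_inverted_index(docs)
print(dict(index["search"]))  # which documents contain "search", and where
```

Swapping in a different `analyze` function changes which terms end up in the index, which is why the analyzer choice matters so much for later retrieval.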

5.2 The data retrieval process

  • At search time, the engine does not simply look up whatever was typed; different query types trigger different retrieval mechanisms.

  • Choosing a different query type can produce completely different results.

For example, "match" (fine-grained, term-level retrieval) and "match_phrase" (coarse-grained phrase matching) return very different results.

match: the input is tokenized first, and the resulting tokens are then searched.

match_phrase: the input is also analyzed, but it is searched as a phrase: the terms must appear in the document adjacent to each other and in the same order (with the default slop of 0).
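The difference between the two query types can be sketched with a toy simulation. The functions below are hypothetical simplifications written for illustration, not Elasticsearch code: `match` mimics the default OR behavior of a match query, and `match_phrase` mimics phrase matching with slop 0, both on a whitespace analyzer.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def match(query, doc):
    """Like a default match query: any query term may appear anywhere."""
    doc_terms = set(analyze(doc))
    return any(term in doc_terms for term in analyze(query))

def match_phrase(query, doc):
    """Like match_phrase with slop 0: terms must be adjacent and in order."""
    q, d = analyze(query), analyze(doc)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

# Hypothetical sample data.
docs = ["quick brown fox", "brown quick fox", "slow brown dog"]
query = "quick brown"
print([doc for doc in docs if match(query, doc)])         # all three documents
print([doc for doc in docs if match_phrase(query, doc)])  # only the first
```

The term-level match recalls every document that shares a token with the query, while the phrase match keeps only the document containing the exact phrase.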

6、 Quantifiable indicators of the Elasticsearch search experience

User experience is a subjective impression, but the quality of search results needs to be quantified.

How? The essential indicators are precision and recall.

6.1 Recall

Definition: the ratio of relevant documents contained in the search results to all relevant documents in the whole collection.

It measures how complete the search results are.

6.2 Precision

Definition: the proportion of the returned results that are relevant.

It measures how exact the search results are.

Both can be understood through the confusion matrix:


                Relevant               Not relevant
Returned        True positive (tp)     False positive (fp)
Not returned    False negative (fn)    True negative (tn)

Given the matrix above, precision and recall can be calculated as follows:

Recall = tp / (tp + fn) * 100%

Precision = tp / (tp + fp) * 100%

If that is hard to follow, here is a plainer explanation:

  • Recall: of all the positive samples, how many were retrieved (how complete the retrieval is).

  • Precision: of the samples you judged to be positive, how many were right (how accurate the guesses are).
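As a worked example of the two formulas, here is a small Python sketch. The document IDs and the relevance judgments are made up for illustration; in practice the relevant set has to come from human labeling or click data.

```python
def precision_recall(returned, relevant):
    """Compute (precision, recall) from returned and relevant document IDs."""
    returned, relevant = set(returned), set(relevant)
    tp = len(returned & relevant)   # returned and relevant
    fp = len(returned - relevant)   # returned but not relevant
    fn = len(relevant - returned)   # relevant but not returned
    precision = tp / (tp + fp) if returned else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall

# Hypothetical judgments: the search returned 4 documents, 3 are truly relevant.
returned = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5"]
precision, recall = precision_recall(returned, relevant)
print(f"precision={precision:.0%} recall={recall:.0%}")  # precision=50% recall=67%
```

Here tp = 2, fp = 2, and fn = 1, so precision is 2/4 and recall is 2/3.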

7、 How to improve the Elasticsearch search experience

As mentioned earlier, the five stages of search are linked together. Search experience is something that designers, front-end and back-end engineers, decision makers, and management all have to think about; it cannot be treated as a purely technical problem.

This article covers only the back-end technical implementation with Elasticsearch:

7.1 Choose an appropriate analyzer for the business scenario

Note that there is no best analyzer and no analyzer that fits every business scenario. Choose the one that best fits yours.

  • If you need fine granularity and anything that matches must be recalled, the ngram tokenizer or the wildcard data type introduced in 7.9 is the preferred choice.

  • Compare candidates in advance to verify that an analyzer meets the business requirements. Options for Chinese: IK, jieba, ansj, and others.

The core API for comparing tokenization results is _analyze; learn it and use it flexibly.

POST _analyze
{
  "text":" Provide the world's outstanding cloud computing services _ Help enterprises go to the cloud without worry ",
  "analyzer": "ik_smart"
}
  • If you choose IK, distinguish between ik_smart and ik_max_word.

ik_smart is coarse-grained tokenization (it returns as few tokens as possible, approximating manual segmentation);

ik_max_word is fine-grained tokenization (it returns as many tokens as possible).
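To get a feel for the most fine-grained end of this trade-off, the ngram approach mentioned in this section can be sketched in a few lines of Python. This is a toy illustration, not the Elasticsearch implementation; the `min_gram` and `max_gram` values are chosen only for the example.

```python
def ngrams(text, min_gram=2, max_gram=3):
    """Return all character n-grams of length min_gram..max_gram, by position."""
    grams = []
    for start in range(len(text)):
        for n in range(min_gram, max_gram + 1):
            if start + n <= len(text):
                grams.append(text[start:start + n])
    return grams

print(ngrams("cloud"))
# ['cl', 'clo', 'lo', 'lou', 'ou', 'oud', 'ud']
```

Because every short substring becomes a term, almost any fragment a user types can be recalled, at the cost of a much larger index and more irrelevant hits.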

7.2 Pay attention to choosing and updating dictionaries

"Even the cleverest housewife cannot cook without rice": the analyzer is the "housewife" and the dictionary is the "rice".

However powerful the analyzer is, it is useless without a dictionary.

So the better the dictionary, the more accurate the tokenization.

Suggestion: once the base lexicon is reasonably complete, add your own industry and domain dictionaries to fit the business scenario.

Even after adding industry and domain dictionaries, how do you handle new words they do not yet cover?

For example, what should you do when new Internet slang or incomplete domain vocabulary causes wrong tokenization and a poor user experience?

Because the analyzer runs as a plugin, the original dictionary cannot be updated dynamically once it is configured; a third-party mechanism is needed.

For example, one way to update the IK dictionary dynamically is to modify the IK analyzer source code so that it loads entries from MySQL, and then update the dictionary by updating those MySQL entries.

7.3 Take data modeling at the mapping stage seriously

  • fielddata on text fields consumes a lot of memory; do not enable it unless necessary.

  • Whether to use the keyword type depends on whether you need sorting or aggregation.

  • For fields that do not need to be searched, set "index" to "false".

  • For fields that do not need to be stored separately, set "store" to "false" (the default).

  • For large text, such as text extracted from Word or PDF files, consider splitting it into smaller chunks for storage.
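The modeling advice above could translate into a mapping along the following lines, written here as a Python dictionary that can be serialized and sent as an index-creation body. The field names are hypothetical, and the ik_max_word and ik_smart analyzers assume the IK plugin is installed.

```python
import json

# Hypothetical mapping; field names are illustrative only.
mapping = {
    "mappings": {
        "properties": {
            # Full-text field; fielddata is left at its default (disabled).
            "title": {
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_smart",
                # keyword sub-field only because sorting/aggregation is needed
                "fields": {"keyword": {"type": "keyword"}},
            },
            # Returned to the client but never searched: skip indexing.
            "thumbnail_url": {"type": "keyword", "index": False},
            # One chunk of a large Word/PDF text per document.
            "body_chunk": {"type": "text", "analyzer": "ik_max_word"},
        }
    }
}
print(json.dumps(mapping, indent=2))
```

Indexing with ik_max_word and searching with ik_smart is one common pairing, but whether it suits a given data set should be checked with the _analyze API first.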

7.4 Choose the right query type for the business scenario

As mentioned earlier, match and match_phrase suit different scenarios, and the difference is obvious.

  • match: high recall, but low precision.

  • match_phrase: phrase matching, with high precision and low recall.

  • wildcard: fuzzy (pattern) matching; not recommended unless necessary.

Of course there are other query types, such as query_string and fuzzy. Choose according to the business scenario.

7.5 Pursue the fastest possible response, and make trade-offs

Users' patience is very limited; do not make them wait.

  • Increase data node memory, and size the heap as an appropriate proportion of it

  • Do not return _source fields that are not needed

  • Do not do complex business processing in the retrieval and return phase

Including but not limited to:

1) Aggregations nested more than two levels deep

2) wildcard or regexp queries

3) Custom highlighting

  • Choose the highlighter type according to the business scenario

Note: when a file is larger than 1 MB (a large file), the fvh highlighter is especially suitable.

  • Make sensible business trade-offs

For example, the default from + size paging limit of 10000 is usually enough. If the product manager disagrees, discuss it and persuade them.

For example, approximate aggregation results are Elasticsearch's default behavior. Either accept that or choose another option (such as ClickHouse). Details are omitted here.
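Two of the points above, trimming _source and staying within the default paging window, can be sketched as a small request builder in Python. The function name, the "title" field, and the page size are assumptions made for illustration; 10000 is the default value of index.max_result_window.

```python
MAX_RESULT_WINDOW = 10000  # Elasticsearch default for index.max_result_window

def build_search_body(keyword, page, page_size=10):
    """Build a lean search body; refuse paging beyond the result window."""
    offset = (page - 1) * page_size
    if offset + page_size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds index.max_result_window")
    return {
        "from": offset,
        "size": page_size,
        "_source": ["title"],  # return only the fields the page renders
        "query": {"match": {"title": keyword}},
    }

print(build_search_body("long johns", page=3))
```

Rejecting deep pages in the application layer gives a clear error message and a natural place to steer users toward refining their keywords instead.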

7.6 Use a smart recommendation / matching mechanism

  • A simple search-box suggestion can be implemented with a prefix query.

GET kibana_sample_data_ecommerce/_search
{
  "_source": "customer_full_name",
  "query": {
    "prefix": {
      "customer_full_name.keyword": "Ed"
    }
  }
}
  • A more complex suggestion that also needs error correction can be implemented with a Suggester.

POST /blogs/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne rock",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}

Space is limited here. For more on the Suggester,

I recommend this article by wood:

https://elasticsearch.cn/article/142

  • More complex still: this requires user behavior recognition plus a recommendation engine.

A good recommendation engine leans toward personalized recommendation. It collects users' valuable digital footprints (such as demographics, business details, interaction logs, purchase records, transaction records, and browsing history) together with product information (such as specifications, user feedback, and comparisons with other products), and analyzes that data before making recommendations.
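The error-correction idea behind the term suggester shown earlier in this section can be approximated with a toy edit-distance sketch. This is not how Elasticsearch implements it; the vocabulary is invented, `suggest` only mimics suggest_mode "missing" (suggest only for words absent from the index), and the limit of 2 edits mirrors the suggester's default max_edits.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def suggest(word, vocabulary, max_edits=2):
    """Suggest close terms only when the word itself is not in the vocabulary."""
    if word in vocabulary:
        return []
    scored = [(edit_distance(word, term), term) for term in vocabulary]
    return [term for dist, term in sorted(scored) if dist <= max_edits]

# Hypothetical set of indexed terms.
vocabulary = {"lucene", "rock", "search", "elastic"}
print(suggest("lucne", vocabulary))  # ['lucene']
print(suggest("rock", vocabulary))   # [] because "rock" already exists
```

For the input "lucne rock", only the misspelled word gets a suggestion, which matches the intent of the "missing" mode.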

8、 Summary

Search experience shapes user experience; user experience determines how many users the product keeps, and in turn whether the product succeeds or fails.

Liang Ning, a well-known product person, says in "30 Lectures on Product Thinking": "We see many new Internet companies whose system capabilities are weaker than those of traditional enterprises, yet they win large numbers of users away from them, and they do it through user experience. When there is a huge gap in scale, user experience can be the core competitive advantage; when competing in the same dimension, user experience is the core competitive advantage."

Search is a traffic gateway, and for every app and website it is contested ground in the fight over user experience.

There is no end to iterating on the search experience; no amount of study or care is too much.

If you have good ideas or suggestions, you are welcome to share them.

References:

  • http://www.woshipm.com/ucd/1037490.html

  • https://zhuanlan.zhihu.com/p/60826371

  • https://www.jianshu.com/p/677742838595

  • http://www.chanpin100.com/article/103633

  • https://www.uisdc.com/search-experience-process-summary

  • http://www.oreilly.com.cn/radar/?p=28

  • 《 Make your own recommendation engine 》



Copyright notice
This article was written by [Mingyi world]. Please include a link to the original when reposting. Thank you.
