
Trying Hybrid Search with Elasticsearch on an Online Supermarket Dataset
Note: this article is a copy of the post below, which I wrote for the organization I belong to. Our rule is that posted articles are the author's own work and may be copied to a personal blog.
Original article: https://tech-blog.mitsucari.com/entry/2026/02/12/163053
Hello, I'm Tsukamoto, a.k.a. tsukaby (@tsukaby0), CTO of Mitsucari.
I recently had a New Year's party with some friends, and we talked a little about search technology.
I'm a layman here, but my friend is a specialist who works on search at a large company. He told me they run hybrid search that combines keyword search and semantic search. The area interests me, but I rarely get to touch Elasticsearch and the like, so I decided to try hybrid search myself.
In this article I load a dataset into Elasticsearch and compare the accuracy of several approaches: keyword search only, semantic search only, and hybrid search.
Search methods explained
Before the experiment, let me sort out the terminology for the search methods used here, along with what each is good and bad at.
What is keyword search (BM25)?
Elasticsearch's default scoring algorithm is Okapi BM25.
BM25 builds on TF-IDF, which computes a relevance score by combining how many times a query term appears in a document (TF: Term Frequency) with how rare that term is across the whole corpus (IDF: Inverse Document Frequency). The weakness of TF-IDF is that a document gains an advantage simply by repeating the same term many times. (If I remember correctly, an old SEO hack was to repeat the same word over and over in a page.)
BM25 is an improved algorithm that adds two adjustments: term frequency saturates, so repeating a term stops paying off, and scores are normalized by document length, so long documents are not favored just because they contain more words.
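The two adjustments can be seen in a small sketch of the textbook Okapi BM25 score for a single term. This is an illustration only, not Elasticsearch's internal code: the function name and the sample numbers are made up, and Lucene's implementation differs in constant factors. The values `k1=1.2` and `b=0.75` are the defaults of Elasticsearch's BM25 similarity.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score contribution of one query term for one document (textbook BM25)."""
    # IDF: terms that appear in few documents (small doc_freq) weigh more.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: documents longer than average are penalized.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # TF saturation: the ratio tf / (tf + norm) approaches 1 as tf grows.
    return idf * tf * (k1 + 1) / (tf + norm)


# Hypothetical corpus: 2200 documents, the term appears in 50 of them.
for tf in (1, 2, 5, 20):
    print(tf, round(bm25_term_score(tf, 100, 100, 2200, 50), 3))
```

Going from 1 to 2 occurrences raises the score noticeably, while going from 19 to 20 barely changes it, which is the saturation behavior described above.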
For more on BM25, the article by wm3 is a helpful reference.

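BM25 is good at exact matches on proper nouns (for example 「アサヒスーパードライ」, Asahi Super Dry) and at narrowing by attribute (for example 「低脂肪 牛乳」, "low-fat milk": here 「低脂肪」 is the rarer term than 「牛乳」, so it gets priority in the score).
Conversely, it is weak when two words mean the same thing but are different strings (for example チキン and 鶏肉, both meaning chicken).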
What is semantic search (vector search / kNN search)?
"Semantic" means "relating to meaning". Semantic search, then, is a method that searches by meaning rather than by matching strings or keywords.
Concretely, text is converted into numeric vectors with an embedding model, and the search is done by similarity between vectors. In Elasticsearch, vectors are stored in a dense_vector field and searched with kNN.
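As a minimal sketch of "search by similarity between vectors", the following computes cosine similarity and an exact, brute-force k-nearest-neighbor lookup. The three-dimensional vectors and document IDs are invented for illustration; real embeddings have hundreds of dimensions, and Elasticsearch uses an approximate index (HNSW) rather than scanning every document.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vec, doc_vecs, k=3):
    """Exact k-nearest neighbors by cosine similarity (brute force)."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy vectors, made up for this example.
docs = {
    "chicken_thigh": [0.9, 0.1, 0.0],
    "roast_chicken": [0.8, 0.3, 0.1],
    "canned_tomato": [0.0, 0.2, 0.9],
}
print(knn([1.0, 0.2, 0.0], docs, k=2))
```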
For how to do this in practice, there are articles that serve as a good reference.


Where BM25 asks "does the document contain the same words?", semantic search asks "is it close in meaning?". This greatly eases the vocabulary mismatch problem. In theory it enables searches such as 「チキン」 → 「鶏肉」, or meaning-based searches such as 「カレー 材料」 (curry ingredients) → potatoes, carrots, curry roux. (This depends heavily on the quality of the embedding model, which the experiment below examines.)
It may look like a strict upgrade over keyword search, but it has weaknesses too. One is exact matching of proper nouns: a search for 「アサヒスーパードライ」 may also return 「キリン一番搾り」 or 「サントリープレミアムモルツ」 because they are close in meaning. Likewise, a search for 「低脂肪 牛乳」 may rank regular milk highly, because regular milk and low-fat milk sit close together in vector space.
Unlike BM25, it also carries the overhead of vector computation.
What is hybrid search (RRF)?
Hybrid search combines keyword search and semantic search to take advantage of the strengths of both.
Elastic has published a good explainer, so it is worth reading for the details.

Elasticsearch offers a rank-fusion method called RRF (Reciprocal Rank Fusion).
RRF merges results based on each document's rank in each search method, not on the raw score values.
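Rank-based fusion can be sketched in a few lines. Each document receives `1 / (rank_constant + rank)` from every ranking it appears in, and the sums decide the final order. The function name and the sample rankings are hypothetical; `rank_constant=60` matches the value used in the hybrid query later in this article.

```python
from collections import defaultdict


def rrf_fuse(rankings, rank_constant=60, top_n=5):
    """Merge several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (rank_constant + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_n]]


bm25_ranking = ["A", "B", "C"]  # hypothetical keyword-search result
knn_ranking = ["C", "A", "D"]   # hypothetical vector-search result
print(rrf_fuse([bm25_ranking, knn_ranking]))
```

Because only ranks are used, BM25 scores and cosine similarities never need to be put on a common scale.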
For RRF, a translated article about the OpenSearch version is also a useful reference.

According to Elastic's blog, hybrid search with RRF improved nDCG@10 by 18% over BM25 alone.
"Reciprocal Rank Fusion increases average NDCG@10 by 1.4% over Elastic Learned Sparse Encoder alone and 18% over BM25 alone." Source: https://www.elastic.co/search-labs/jp/blog/improving-information-retrieval-elastic-stack-hybrid
nDCG@10 is a standard metric for search quality. It expresses, on a scale from 0 to 1, how well the top 10 results match the correct answers. The more relevant results appear near the top, the higher the score, so it is well suited to measuring ranking quality.
Note that nDCG@10 is only a metric: you have to define and prepare the correct answers yourself.
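A minimal sketch of the metric, assuming graded relevance judgments supplied by hand as a dictionary. It uses the linear-gain form of DCG; another common variant uses `2**rel - 1` as the gain. The document IDs and grades are made up.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: later positions are discounted by log2."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg_at_k(ranked_ids, judgments, k=10):
    """nDCG@k for one query; judgments maps doc ID -> relevance grade."""
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments for one query (3 = best match).
judgments = {"a": 3, "b": 2, "c": 1}
print(ndcg_at_k(["a", "b", "c"], judgments))
print(ndcg_at_k(["c", "x", "a"], judgments))
```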
The advantage of hybrid search is that it gets the best of both keyword search and semantic search.
Experiment
So far we have covered the search methods. Keyword search and vector search each have sharp strengths and weaknesses, so either one alone seems likely to fall short, while hybrid search combining both looks likely to improve accuracy.
From here I prepare test data and run each kind of search against it to check the accuracy.
The proper way to evaluate search accuracy is to use a standard information-retrieval metric such as the nDCG@10 mentioned above and compare against hand-made ground-truth data. However, building that ground truth requires someone with domain knowledge to judge relevance for every query, which takes considerable effort.
To keep things simple, this time I visually compare the Top 5 results of each method for 8 test queries and evaluate them qualitatively: "do the expected products appear near the top?" and "is there noise (unrelated products) mixed in?". This is not a rigorous benchmark, but it is enough to get a feel for what each method does well and badly. Also, since the goal is to try hybrid search, accuracy is secondary.
The dataset used here
This time I prepare my own data with generative AI.

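The Amazon ESCI data introduced by rejasupotaro is very appealing, but e-commerce data covers a very wide range of product types, so I do not use it here.
With the help of an LLM, I prepared plausible-looking online supermarket data. I chose an online supermarket because I use one often, the products are familiar, and there are fewer product types than in general e-commerce. There does not seem to be an open dataset of Japanese products, so I had to build my own, but I think it makes a good subject.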
Prompt
Generate 100 online supermarket product records for the {category} category.
Based on the actual category structure of the Seiyu online supermarket, this prompt is run once for each of the following categories.
- Vegetables / Fruit / Meat / Fish / Deli and bento / Ham, sausage and chilled prepared foods
- Eggs, milk and dairy / Tofu, natto, pickles and fish-paste products / Frozen foods and ice cream
- Rice, noodles and pasta / Bread, jam and cereal / Cooking oil, curry, soup and seasonings
- Canned goods, flour and dried foods / Sweets and desserts / Beverages and water / Alcohol and non-alcoholic drinks
- Paper, sanitary products and nursing care / Beauty and hygiene / Daily necessities and sundries / Kitchenware / Baby / Pets
Include the following fields for each product.
- product_id: in the form "{category_prefix}-001" to "{category_prefix}-100"
- product_name: product name (e.g. "北海道産 特選ゴーダチーズ 200g")
- category: category name
- price: price (integer, in yen)
- description: product description (about 50 to 100 characters; include not only ingredients and origin but also recommended ways to eat it or uses)
- tags: array of related tags (e.g. ["チーズ", "北海道", "おつまみ"])
For data diversity, keep the following in mind.
- Use multiple expressions for the same ingredient (e.g. 鶏肉/チキン, トマト/ミニトマト/プチトマト, 牛乳/ミルク)
- Include in the description the dishes or occasions where the product is used (e.g. "カレーの具材に最適" "お弁当のおかずにぴったり")
- Use realistic brand names and place-of-origin names
Output in JSONL format (one object per line).
This took quite a while, around 15 minutes. I cannot attach a jsonl file to this blog, so the file is omitted.
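Test queries
Across all steps, search accuracy is compared using the following 8 queries.
| # | Query | Type | Aim |
|---|---|---|---|
| Q1 | トマト (tomato) | Simple keyword | Baseline. Should hit at every step. Slightly disadvantaged with semantic search? |
| Q2 | ニンジン (carrot) | Spelling variation | The data mixes 「にんじん」 (hiragana, from Chiba Prefecture) and 「ニンジン」 (katakana, from Ibaraki Prefecture). Will BM25 hit only one of them (Ibaraki)? |
| Q3 | チキン (chicken) | Synonym | Product names say 「鶏肉」, but some people will surely search for 「チキン」. The data contains both. BM25 will probably not hit 鶏肉. |
| Q4 | シーチキン (Sea Chicken) | Colloquial name | The data only has 「ツナ缶」 (canned tuna). Can semantic search pick up a colloquial name that is not in the dictionary? シーチキン is a trademark, so it may be a bit tough. |
| Q5 | カレー 材料 (curry ingredients) | Use case | Do potatoes, carrots, curry powder and so on, whose description mentions 「カレー」, come out on top? Semantic search might manage this too. |
| Q6 | 味噌汁の具 (miso soup ingredients) | Use case | Can it pick up products across categories whose description contains 「味噌汁」, such as tofu, wakame, spinach and green onion? |
| Q7 | 低脂肪 牛乳 (low-fat milk) | With attribute | Does 「小岩井 牛乳低脂肪」 come out on top, pinpoint? |
| Q8 | アサヒスーパードライ (Asahi Super Dry) | Proper noun | Exact match. The pattern BM25 handles best. |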
Setup
So that the experiment can be redone any number of times, I create a separate index for each search pattern and load the full data into each. Three indices coexist in a single Elasticsearch container.
| Index name | Purpose | Analyzer | Vectors |
|---|---|---|---|
| products_default | Step 1: BM25 baseline | Standard (default) | No |
| products_kuromoji | Step 2: Japanese-aware BM25 | kuromoji + synonym | No |
| products_vector | Step 3 & 4: vector search / hybrid | kuromoji + synonym | Yes |
Step 3 (kNN only) and Step 4 (RRF) differ only in how they search, so they share the same index.
Setup has two stages: first, pre-generating the vectors (Python); then, starting Elasticsearch and loading the data (bash).
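Vector generation
The vector search in Steps 3 and 4 converts the "meaning" of the product text into numeric vectors and searches by similarity. Where BM25 looks at keyword matches, vector search can capture relationships where the words differ but the meaning is close, such as 「チキン」 and 「鶏肉」, or 「カレー 材料」 and 「じゃがいも」 (potato).
This vectorization needs an embedding model, and the vectors are computed in advance, before loading into Elasticsearch. The vectors for the test queries are generated at the same time, so that the experiment script can run with bash alone.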
mkdir es_workspace
cd es_workspace
# cp: copy the product data generated beforehand with generative AI (all_products.jsonl) into this directory
pip install sentence-transformers
Save the following file as embed_products.py.
import json

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-small")

# --- Vectorize the product data ---
with open("all_products.jsonl", "r", encoding="utf-8") as f:
    products = [json.loads(line) for line in f if line.strip()]

for p in products:
    # Embed the product name and description together, with the "passage: " prefix.
    text = f"{p['product_name']} {p['description']}"
    p["embedding"] = model.encode(f"passage: {text}").tolist()

with open("all_products_with_embeddings.jsonl", "w", encoding="utf-8") as f:
    for p in products:
        f.write(json.dumps(p, ensure_ascii=False) + "\n")

print(f"Product vectorization complete: {len(products)} items, dimensions: {len(products[0]['embedding'])}")

# --- Pre-generate the test query vectors ---
queries = [
    "トマト", "ニンジン", "チキン", "シーチキン",
    "カレー 材料", "味噌汁の具", "低脂肪 牛乳", "アサヒスーパードライ",
]
query_vectors = {}
for q in queries:
    # Queries use the "query: " prefix instead of "passage: ".
    query_vectors[q] = model.encode(f"query: {q}").tolist()

with open("query_vectors.json", "w", encoding="utf-8") as f:
    json.dump(query_vectors, f, ensure_ascii=False)

print(f"Query vector generation complete: {len(queries)} items")
Run it with python embed_products.py. It took about one minute on my M4 MacBook Air.
Product vectorization complete: 2200 items, dimensions: 384
Query vector generation complete: 8 items
It looks like it worked.
Starting ES, creating the indices, loading the data
Next, start ES and create the indices.
# Start Elasticsearch
docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  elasticsearch:9.3.0
# Wait for startup to finish (takes a few dozen seconds)
until curl -s "http://localhost:9200" > /dev/null 2>&1; do sleep 2; done
# Install the kuromoji plugin
docker exec elasticsearch elasticsearch-plugin install analysis-kuromoji
docker restart elasticsearch
# Wait for the restart to finish
until curl -s "http://localhost:9200" > /dev/null 2>&1; do sleep 2; done
# Activate the trial license
curl -X POST "http://localhost:9200/_license/start_trial?acknowledge=true&pretty"
The RRF (Reciprocal Rank Fusion) used in Step 4 is part of Elasticsearch's paid features (Enterprise tier) and is not available under the default Basic license.
Calling the API above starts a 30-day trial without registering an email address or anything else, and makes all paid features, including RRF, available. Note that the trial can be started only once per major version on the same cluster. See Elastic's documentation for details.

# For Step 1: Standard Analyzer (default)
curl -s -X PUT "http://localhost:9200/products_default" -H "Content-Type: application/json" -d '{
"mappings": {
"properties": {
"product_id": { "type": "keyword" },
"product_name": { "type": "text" },
"category": { "type": "keyword" },
"price": { "type": "integer" },
"description": { "type": "text" },
"tags": { "type": "keyword" }
}
}
}'
# For Step 2: kuromoji + synonym
curl -s -X PUT "http://localhost:9200/products_kuromoji" -H "Content-Type: application/json" -d '{
"settings": {
"analysis": {
"tokenizer": {
"kuromoji_tokenizer": { "type": "kuromoji_tokenizer", "mode": "search" }
},
"filter": {
"synonym_filter": {
"type": "synonym",
"synonyms": [
"鶏肉,チキン,とり肉",
"豚肉,ポーク",
"牛肉,ビーフ",
"トマト,ミニトマト,プチトマト",
"牛乳,ミルク",
"じゃがいも,ポテト,馬鈴薯",
"たまねぎ,玉ねぎ,オニオン"
]
}
},
"analyzer": {
"ja_analyzer": {
"type": "custom",
"tokenizer": "kuromoji_tokenizer",
"filter": ["kuromoji_baseform", "kuromoji_part_of_speech", "synonym_filter", "lowercase"]
}
}
}
},
"mappings": {
"properties": {
"product_id": { "type": "keyword" },
"product_name": { "type": "text", "analyzer": "ja_analyzer" },
"category": { "type": "keyword" },
"price": { "type": "integer" },
"description": { "type": "text", "analyzer": "ja_analyzer" },
"tags": { "type": "keyword" }
}
}
}'
# For Step 3 & 4: kuromoji + synonym + dense_vector
curl -s -X PUT "http://localhost:9200/products_vector" -H "Content-Type: application/json" -d '{
"settings": {
"analysis": {
"tokenizer": {
"kuromoji_tokenizer": { "type": "kuromoji_tokenizer", "mode": "search" }
},
"filter": {
"synonym_filter": {
"type": "synonym",
"synonyms": [
"鶏肉,チキン,とり肉",
"豚肉,ポーク",
"牛肉,ビーフ",
"トマト,ミニトマト,プチトマト",
"牛乳,ミルク",
"じゃがいも,ポテト,馬鈴薯",
"たまねぎ,玉ねぎ,オニオン"
]
}
},
"analyzer": {
"ja_analyzer": {
"type": "custom",
"tokenizer": "kuromoji_tokenizer",
"filter": ["kuromoji_baseform", "kuromoji_part_of_speech", "synonym_filter", "lowercase"]
}
}
}
},
"mappings": {
"properties": {
"product_id": { "type": "keyword" },
"product_name": { "type": "text", "analyzer": "ja_analyzer" },
"category": { "type": "keyword" },
"price": { "type": "integer" },
"description": { "type": "text", "analyzer": "ja_analyzer" },
"tags": { "type": "keyword" },
"embedding": { "type": "dense_vector", "dims": 384, "index": true, "similarity": "cosine" }
}
}
}'
Next, load the data.
# Load the same JSONL into products_default and products_kuromoji
for INDEX in products_default products_kuromoji; do
jq -c '{"index": {"_index": "'"$INDEX"'", "_id": .product_id}}, .' all_products.jsonl \
| curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @- > /dev/null
done
# Load the data with embeddings into products_vector
jq -c '{"index": {"_index": "products_vector", "_id": .product_id}}, .' all_products_with_embeddings.jsonl \
| curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @- > /dev/null
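For readers less familiar with jq, here is a rough Python equivalent of what the jq filter produces: the Bulk API body alternates an action line and a document line, one JSON object per line, and must end with a newline. The helper name and the sample record are hypothetical; sending the body to `/_bulk` is left out.

```python
import json


def to_bulk_ndjson(index_name, products):
    """Build an Elasticsearch Bulk API body (NDJSON) that indexes each product."""
    lines = []
    for product in products:
        # Action line: which index to write to, using product_id as the document ID.
        action = {"index": {"_index": index_name, "_id": product["product_id"]}}
        lines.append(json.dumps(action, ensure_ascii=False))
        # Source line: the document itself.
        lines.append(json.dumps(product, ensure_ascii=False))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


sample = [{"product_id": "veg-001", "product_name": "高知産 トマト"}]  # made-up record
print(to_bulk_ndjson("products_default", sample), end="")
```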
Experiment script (run.sh)
Once setup is complete, all 8 queries × 4 patterns are run in one go and the results are compared side by side.
The four search patterns are as follows.
| Step | Search pattern | Index | What it shows |
|---|---|---|---|
| Step 1 | BM25 (default) | products_default | The Standard Analyzer as is. No Japanese tokenization |
| Step 2 | BM25 (kuromoji + synonym) | products_kuromoji | The effect of Japanese morphological analysis and a synonym dictionary |
| Step 3 | kNN search only | products_vector | The accuracy of vector search alone. BM25 is not used |
| Step 4 | Hybrid (RRF) | products_vector | BM25 + kNN fused. Can it use the strengths of both? |
Doing this by hand is tedious, so I had an AI write the script. Save the following script as run.sh and run it.
#!/bin/bash
QUERIES=("トマト" "ニンジン" "チキン" "シーチキン" "カレー 材料" "味噌汁の具" "低脂肪 牛乳" "アサヒスーパードライ")
QUERY_VECTORS=$(cat query_vectors.json)
search_bm25() {
local index=$1 query=$2
curl -s "http://localhost:9200/${index}/_search" -H "Content-Type: application/json" -d '{
"query": { "multi_match": { "query": "'"$query"'", "fields": ["product_name", "description"] } },
"size": 5, "_source": ["product_name"]
}' | jq -r '.hits.hits[]._source.product_name'
}
search_knn() {
local query=$1
local vec=$(echo "$QUERY_VECTORS" | jq -c --arg q "$query" '.[$q]')
curl -s "http://localhost:9200/products_vector/_search" -H "Content-Type: application/json" -d '{
"knn": { "field": "embedding", "query_vector": '"$vec"', "k": 10, "num_candidates": 50 },
"size": 5, "_source": ["product_name"]
}' | jq -r '.hits.hits[]._source.product_name'
}
search_hybrid() {
local query=$1
local vec=$(echo "$QUERY_VECTORS" | jq -c --arg q "$query" '.[$q]')
curl -s "http://localhost:9200/products_vector/_search" -H "Content-Type: application/json" -d '{
"retriever": {
"rrf": {
"retrievers": [
{ "standard": { "query": { "multi_match": { "query": "'"$query"'", "fields": ["product_name", "description"] } } } },
{ "knn": { "field": "embedding", "query_vector": '"$vec"', "k": 10, "num_candidates": 50 } }
],
"rank_window_size": 50,
"rank_constant": 60
}
},
"size": 5, "_source": ["product_name"]
}' | jq -r '.hits.hits[]._source.product_name'
}
for query in "${QUERIES[@]}"; do
echo ""
echo "========================================"
echo "Q: $query"
echo "========================================"
echo "--- Step 1: BM25 (default) ---"
search_bm25 "products_default" "$query"
echo "--- Step 2: BM25 (kuromoji+synonym) ---"
search_bm25 "products_kuromoji" "$query"
echo "--- Step 3: kNN ---"
search_knn "$query"
echo "--- Step 4: Hybrid (RRF) ---"
search_hybrid "$query"
done
Results
Running run.sh produced the following output.
========================================
Q: トマト
========================================
--- Step 1: BM25 (default) ---
高知産 トマト
高知産 トマト
--- Step 2: BM25 (kuromoji+synonym) ---
トマトペースト
トマトペースト
åèç£ ããããã
トマトスープ 粉末
トマトスープ 粉末
--- Step 3: kNN ---
カットトマト缶 400g
カットトマト缶 400g
トマトペースト
トマトペースト
高知産 トマト
--- Step 4: Hybrid (RRF) ---
トマトペースト
トマトペースト
トマトスープ 粉末
カットトマト缶 400g
高知産 トマト
========================================
Q: ニンジン
========================================
--- Step 1: BM25 (default) ---
茨城産 ニンジン
--- Step 2: BM25 (kuromoji+synonym) ---
茨城産 ニンジン
--- Step 3: kNN ---
茨城産 ニンジン
高知産 生姜
ジンジャーエール
ジンジャーエール
青森産 にんにく
--- Step 4: Hybrid (RRF) ---
茨城産 ニンジン
高知産 生姜
ジンジャーエール
ジンジャーエール
青森産 にんにく
========================================
Q: チキン
========================================
--- Step 1: BM25 (default) ---
--- Step 2: BM25 (kuromoji+synonym) ---
çæ¬ç£ é¶èããã¹ããŒã
çæ¬ç£ é¶èããã¹ããŒã
大分産 鶏肉焼き肉用
大分産 鶏肉焼き肉用
åæµ·éç£ é¶èããè
--- Step 3: kNN ---
ローストチキン
国産 鶏肉皮
国産 チキンナゲット
ローストチキン
ローストチキン
--- Step 4: Hybrid (RRF) ---
国産 鶏肉皮
国産 チキンナゲット
ローストチキン
宮崎産 鶏肉細切れ
åœç£ é¶èæçŸœå
========================================
Q: シーチキン
========================================
--- Step 1: BM25 (default) ---
--- Step 2: BM25 (kuromoji+synonym) ---
--- Step 3: kNN ---
ローストチキン
国産 鶏肉皮
ローストチキン
ローストチキン
åœç£ é¶èæçŸœå
--- Step 4: Hybrid (RRF) ---
ローストチキン
国産 鶏肉皮
ローストチキン
ローストチキン
åœç£ é¶èæçŸœå
========================================
Q: カレー 材料
========================================
--- Step 1: BM25 (default) ---
ホールトマト缶 400g
カットトマト缶 400g
ホールトマト缶 400g
カットトマト缶 400g
長野産 メークイン
--- Step 2: BM25 (kuromoji+synonym) ---
バーモントカレー 甘口
ゴールデンカレー 甘口
ゴールデンカレー 辛口
バーモントカレー 甘口
ゴールデンカレー 甘口
--- Step 3: kNN ---
ホールトマト缶 400g
ホールトマト缶 400g
カットトマト缶 400g
カットトマト缶 400g
カレーパン スパイシー
--- Step 4: Hybrid (RRF) ---
ゴールデンカレー 甘口
カレーパン スパイシー
ゴールデンカレー 甘口
カレーパン スパイシー
カレーフレーク 甘口
========================================
Q: 味噌汁の具
========================================
--- Step 1: BM25 (default) ---
味噌汁の素 即席
味噌汁の素 即席
味噌 白み味噌 500g
味噌 白み味噌 500g
相模屋 油揚げ 薄揚げ
--- Step 2: BM25 (kuromoji+synonym) ---
味噌汁の素 即席
味噌汁の素 即席
相模屋 油揚げ 薄揚げ
相模屋 油揚げ 薄揚げ
相模屋 油揚げ 薄揚げ
ããã«ã³ çŒãè±è
æ
ããã«ã³ çŒãè±è
æ
ããã«ã³ çŒãè±è
æ
ããã«ã³ çŒãè±è
æ
ããã«ã³ çŒãè±è
æ
--- Step 4: Hybrid (RRF) ---
ããã«ã³ çŒãè±è
æ
ããã«ã³ çŒãè±è
æ
紀文 厚揚げ 焼き厚揚げ
ããã«ã³ çŒãè±è
æ
ããã«ã³ çŒãè±è
æ
========================================
Q: 低脂肪 牛乳
========================================
--- Step 1: BM25 (default) ---
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
--- Step 2: BM25 (kuromoji+synonym) ---
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
--- Step 3: kNN ---
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
--- Step 4: Hybrid (RRF) ---
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
小岩井 牛乳低脂肪
========================================
Q: アサヒスーパードライ
========================================
--- Step 1: BM25 (default) ---
アサヒスーパードライ 350ml缶
アサヒスーパードライ 350ml缶
--- Step 2: BM25 (kuromoji+synonym) ---
アサヒスーパードライ 350ml缶
アサヒスーパードライ 350ml缶
アサヒ クリアアサヒ 350ml缶
アサヒ クリアアサヒ 350ml缶
ã¹ãŒããŒãã©ã€ãã©ã€ã»ãã³ 350ml猶
--- Step 3: kNN ---
アサヒスーパードライ 350ml缶
アサヒスーパードライ 350ml缶
アサヒ0.00 350ml缶
アサヒ0.00 梅 350ml缶
アサヒ クリアアサヒ 350ml缶
--- Step 4: Hybrid (RRF) ---
アサヒスーパードライ 350ml缶
アサヒスーパードライ 350ml缶
アサヒ クリアアサヒ 350ml缶
アサヒ クリアアサヒ 350ml缶
アサヒ0.00 350ml缶
Note: the LLM-generated test data contains duplicates, so several records share the same product name (「高知産 トマト」, 「小岩井 牛乳低脂肪」, 「ローストチキン」 and others). That is why the same product name repeats within the Top 5, but the search itself is working correctly. Properly, the data should be cleansed or the duplicates removed with Elasticsearch's collapse feature, but here the results are compared as they are.
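As a sketch of the cleanup this note describes, duplicates can also be dropped on the client side after fetching more hits than needed. The helper below is hypothetical and is not what the experiment did. If collapse were used instead, it would need a keyword or numeric field, whereas product_name is mapped as text in the indices above.

```python
def dedupe_by_name(hits, size=5):
    """Keep the first occurrence of each product name, preserving rank order."""
    seen = set()
    unique = []
    for name in hits:
        if name not in seen:
            seen.add(name)
            unique.append(name)
        if len(unique) == size:
            break
    return unique


# Product names taken from the Q1 output above, in an arbitrary order.
hits = ["高知産 トマト", "高知産 トマト", "トマトペースト", "トマトペースト", "カットトマト缶 400g"]
print(dedupe_by_name(hits))
```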
Discussion
トマト (tomato)
BM25 (default) hits the tomatoes from Kochi, and the other searches also hit tomatoes and canned tomatoes, which is good. Hybrid search in particular hits 4 different kinds of product, which counts as a good result. In a real online supermarket, someone searching for 「トマト」 most likely wants to buy tomatoes, so I would like 高知産 トマト to come first.
ニンジン (carrot)
Unfortunately, even with kuromoji in place, the spelling variation between 「ニンジン」 and 「にんじん」 was not resolved, and the carrots from Chiba Prefecture were not hit. From kNN onward, ginger and garlic are judged to be close in vector terms, which by any normal standard is a bad result. The biggest problem is the test data: there were no other products containing the word 「にんじん」, and no matching data such as carrot juice, so I think these items surfaced as a last resort. There is vegetable juice in the data, and I can more or less understand garlic being closer in vector space than vegetable juice.
チキン (chicken)
This one is excellent. Hybrid search seems to give the best result!
First, products_default has no synonym dictionary, so it is only natural that BM25 (default) returns 0 hits. BM25 with synonyms is already much better.
On top of that, it is great that kNN returns ローストチキン (roast chicken) and チキンナゲット (chicken nuggets). I suspect that most people who search for チキン in an online supermarket want processed chicken products rather than raw meat. This is where the strength of semantic search showed.
Finally, hybrid search. I doubt many people search for チキン wanting to buy chicken skin, but I think the breadth of the hits is excellent.
シーチキン (Sea Chicken)
As expected, this was too much to ask! BM25 naturally returns 0 hits, and both kNN and hybrid return nothing but chicken products such as 「ローストチキン」 and 「鶏肉皮」, never reaching canned tuna.
Why did vector search fail to pick it up as well? The embedding model used here (intfloat/multilingual-e5-small) is a general-purpose multilingual model, and it does not know that 「シーチキン」 is a product name for canned tuna in Japan (a registered trademark of Hagoromo Foods). To the model, 「シーチキン」 is simply a compound containing 「チキン」, so the vector likely drifted toward chicken.
If I were to improve this, the easiest fix would be to add 「シーチキン,ツナ缶」 to the synonym dictionary. However, covering every colloquial name and trademark with synonyms is not realistic, so a fundamental fix would likely need a domain-specific embedding model, or fine-tuning a general-purpose model on Japanese food data.
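To illustrate what a synonym rule does, here is a minimal sketch of query-side expansion. It is not how Elasticsearch's synonym filter is implemented: real expansion happens on tokens produced by the analyzer, whereas this sketch assumes whitespace-separated tokens. The function names are made up, and the second group is the addition proposed above.

```python
SYNONYM_GROUPS = [
    {"鶏肉", "チキン", "とり肉"},
    {"シーチキン", "ツナ缶"},  # the rule proposed in the text, not part of the experiment
]


def expand_token(token):
    """Return every synonym of the token, or the token itself if it has none."""
    for group in SYNONYM_GROUPS:
        if token in group:
            return sorted(group)
    return [token]


def expand_query(query):
    """Expand each whitespace-separated token of the query independently."""
    return [expand_token(token) for token in query.split()]


print(expand_query("シーチキン"))
print(expand_query("カレー 材料"))
```

With the extra group in place, a query for 「シーチキン」 also matches documents that only say 「ツナ缶」, at the cost of maintaining the list by hand.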
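カレー 材料 (curry ingredients)
This gave a mixed result. BM25 (kuromoji+synonym) may be the best here.
Items like ホールトマト缶 (canned whole tomatoes) have the word カレー in their description, which is presumably why they were hit. I think those are fine. カレーパン (curry bread) is reasonable as a vector match, but it is not an ingredient.
Many more vegetables and other items use the word カレー in their description, so I would like those to be hit as well. In practice hardly anyone uses an online supermarket with a query like this, but cases such as not remembering every ingredient for curry, or forgetting a product name and searching with a roughly equivalent word, are conceivable. Depending on the situation it seems necessary.
Incidentally, searching for 「カレー 材料」 on Seiyu's online supermarket returns no ingredients. OK Store's online supermarket returns many hits, but most are curry roux and spices, with only a single hit for carrots near the end. Search accuracy in this area still seems to have a long way to go.
味噌汁の具 (miso soup ingredients)
A crushing defeat. It is much the same as the curry case. Being hybrid does not by itself make the result good.
低脂肪 牛乳 (low-fat milk)
Here the data I prepared was the problem. The same record was present five times, so no difference shows up between the methods.
Incidentally, the data also contains low-fat food for pets, but that is not milk, so it is good that those were not hit.
アサヒスーパードライ (Asahi Super Dry)
There are some debatable points, but hybrid search may give the best result here as well.
Since this is a pure string search, one could argue that the BM25 (default) result is the best. However, some users may decide to go with クリアアサヒ instead, and some may actually be looking for the non-alcoholic version rather than アサヒスーパードライ itself, so the hybrid result can be called the best.
How far noise can be reduced matters in search, but to me these results do not feel like noise.
Incidentally, the test data also contains Kirin products, but they do not show up in kNN.
Summary
Mainly because of the poor quality of the test data, the experimental results ended up feeling half-baked.
Still, I was able to try hybrid search itself, and some of the results looked good, so I am satisfied.
In addition, along the way I was able to update knowledge that had stopped at TF-IDF (to BM25), and it was good to learn about RRF and more.
With the attention on generative AI in recent years, I feel search technology, RAG included, is also attracting attention. I expect it to remain an important technology, so I want to keep following it regularly.
I used Elasticsearch this time because my friend uses it, but the same can be done with PostgreSQL.
See, for example, the article by ki2ka.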

Mitsucari is currently hiring IT engineers. If you are interested, please feel free to get in touch!