Corrupted Elasticsearch shard: how to reset it?

Hello.

Following a problem on a SAN (whose nature I am still investigating), I found the filesystem hosting /var/spool damaged and had to repair it with xfs_repair. Overall, the BlueMind data seems intact, except for Elasticsearch, which refuses to start:

[2019-07-26T10:08:20,813][WARN ][o.e.i.e.Engine           ] [eSN6uxf] [mailspool_2][0] failed engine [corrupt file (source: [start])]
org.apache.lucene.index.CorruptIndexException: Problem reading index. (resource=/var/spool/bm-elasticsearch/data/nodes/0/indices/ljtHz4kbRtCx-A95_VuLhQ/0/index/_2dx_Lucene50_0.tim)
        at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:143) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:82) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:172) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:211) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:105) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:523) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:103) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:79) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
        at org.elasticsearch.index.engine.InternalEngine.createSearcherManager(InternalEngine.java:537) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:209) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:160) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:2188) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:2170) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1377) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1332) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:421) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:301) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1603) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$4(IndexShard.java:2055) ~[elasticsearch-6.4.3.jar:6.4.3]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.4.3.jar:6.4.3]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
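
To see at a glance which indexes and shards are affected, the "failed engine" lines can be pulled out of the Elasticsearch log. The snippet below is only a minimal sketch: the function name and the regular expression are illustrative assumptions based on the log line format shown above, and the log file location is deliberately left to the caller.

```python
import re

# Assumed line format, taken from the excerpt above:
# [...][WARN ][o.e.i.e.Engine ] [node] [mailspool_2][0] failed engine [corrupt file ...]
FAILED_ENGINE = re.compile(
    r"\[(?P<index>[^\]\[]+)\]\[(?P<shard>\d+)\] failed engine \[(?P<reason>.*)\]\s*$"
)


def find_failed_shards(lines):
    """Return a sorted list of (index name, shard number) pairs reported as failed."""
    found = set()
    for line in lines:
        match = FAILED_ENGINE.search(line)
        if match:
            found.add((match.group("index"), int(match.group("shard"))))
    return sorted(found)
```

It can be fed any iterable of lines, for example an open log file: `find_failed_shards(open(path_to_log))`, where `path_to_log` is whatever path your installation uses.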

As a result, bm-core logs errors like this one:

2019-07-26 08:25:43,634 [BM-Core26] n.b.c.r.b.RestServiceMethodHandler ERROR - Error during restcall RestRequest [path=/internal-api/db_message_bodies/bm-master__fws_fr/9fe1ad87ed2378960d6be6cafba330fbab50e569, method=PUT, User-Agent=null
, params=, remoteAddresses=[10.29.3.14], origin=null]:  class net.bluemind.core.api.fault.ServerFault: java.util.concurrent.ExecutionException: UnavailableShardsException[mailspool_pending][4] primary shard is not active Timeout: [1m], r
equest: [BulkShardRequest [[mailspool_pending][4]] containing [index {[mailspool_pending][eml][9fe1ad87ed2378960d6be6cafba330fbab50e569], source{"preview":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK Date: 2019.07.26 à 09:
44:04 Criticité: Disaster Item: Réponse au ping (icmpping,2,,,2000]) Valeur : ","date":"2019-07-26T07:52:07Z","cc":],"headers":{"cc":null,"from":"<zabbix@firewall-services.com>","to":"<daniel@firewall-services.com>"},"subject":"rp.fws.f
r: L'adresse 10.29.2.12 ne répond plus au ping: OK","content":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK","rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK Date: 2019.07.26 à 09:44:04 Criticité: Disaster Item: 
Réponse au ping (icmpping,2,,,2000]) Valeur : 1","zabbix@firewall-services.com","daniel@firewall-services.com"],"subject_kw":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK","with":"zabbix@firewall-services.com","daniel@fire
wall-services.com"],"size":2442,"content-type":"text/plain","from":"zabbix@firewall-services.com"],"to":"daniel@firewall-services.com"],"has":]}]}]]]
        at net.bluemind.backend.mail.replica.service.internal.DbMessageBodiesService.create(DbMessageBodiesService.java:107)
        ... 19 common frames omitted
Caused by:
  class java.util.concurrent.ExecutionException: UnavailableShardsException[mailspool_pending][4] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[mailspool_pending][4]] containing [index {[mailspool_pending][eml][9fe1ad87ed2378960d6be6cafba330fbab50e569], source{"preview":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK Date: 2019.07.26 à 09:44:04 Criticité: Disaster Item: Réponse au ping (icmpping,2,,,2000]) Valeur : ","date":"2019-07-26T07:52:07Z","cc":],"headers":{"cc":null,"from":"<zabbix@firewall-services.com>","to":"<daniel@firewall-services.com>"},"subject":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK","content":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK","rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK Date: 2019.07.26 à 09:44:04 Criticité: Disaster Item: Réponse au ping (icmpping,2,,,2000]) Valeur : 1","zabbix@firewall-services.com","daniel@firewall-services.com"],"subject_kw":"rp.fws.fr: L'adresse 10.29.2.12 ne répond plus au ping: OK","with":"zabbix@firewall-services.com","daniel@firewall-services.com"],"size":2442,"content-type":"text/plain","from":"zabbix@firewall-services.com"],"to":"daniel@firewall-services.com"],"has":]}]}]]]
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
        at net.bluemind.backend.mail.replica.service.internal.DbMessageBodiesService.create(DbMessageBodiesService.java:104)
        ... 19 common frames omitted

What can I do to get all of this back in order? How can I reset all the Elasticsearch indexes and rebuild them?

Which version of BlueMind are you running?

4.0.7 (with a subscription)

Anyone?

Is the Elasticsearch service started?

Nope. As stated in the first message, the service refuses to start because of the corrupted data.

Hi,

Have you looked at the documentation, at this URL: https://forge.bluemind.net/confluence/display/BM40/Problemes+de+recherche+et+indexation ?

Pascal

OK, I had indeed missed that page. It seems to be working (reindexing is in progress). I'll come back and shout for help if I run into any other problems. Thanks :slight_smile:
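
For anyone following the same path: once Elasticsearch is running again, one way to check whether any index is still in a bad state is to query the `_cat/indices` API and look at the `health` column. The following is a rough sketch, not part of the BlueMind procedure. It assumes Elasticsearch answers on `http://localhost:9200` (the Elasticsearch default, which may differ on your installation), and the function names are made up for the example.

```python
import json
from urllib.request import urlopen


def red_indices(rows):
    """Return the names of indexes whose reported health is 'red'."""
    return [row["index"] for row in rows if row.get("health") == "red"]


def fetch_indices(base_url="http://localhost:9200"):
    """Fetch the _cat/indices listing as parsed JSON (base_url is an assumption)."""
    with urlopen(base_url + "/_cat/indices?format=json") as response:
        return json.load(response)


if __name__ == "__main__":
    for name in red_indices(fetch_indices()):
        print("still red:", name)
```

An empty result only means that no index is reported as red. It does not tell you whether the reindexing itself has finished.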