Hi, I am creating a topology using Apache Storm in which my spout collects data from a Kafka topic and sends it to a bolt.

I perform some validation on each tuple and emit it again on a stream for another bolt.

Now the issue is that my second bolt, which consumes the stream from the first bolt, has an overridden method prepare(Map<String, Object> map, TopologyContext topologyContext, OutputCollector outputCollector) that is executed repeatedly, roughly every 2 seconds.

Code for topology is

    topologyBuilder.setBolt("abc", new ValidationBolt()).shuffleGrouping(configurations.SPOUT_ID);
    topologyBuilder.setBolt("TEST", new TestBolt()).shuffleGrouping("abc", Utils.VALIDATED_STREAM);

Code for First bolt "abc" is

    @Override
    public void execute(Tuple tuple) {
        String document = String.valueOf(tuple.getValue(4));
        if (Utils.isJSONValid(document)) {
            outputCollector.emit(Utils.VALIDATED_STREAM, new Values(document));
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
        outputFieldsDeclarer.declareStream(Utils.VALIDATED_STREAM, new Fields("document"));
    }

While I was searching I found

The prepare method is called when the bolt is initialised and is similar to the open method in spout. It is called only once for the bolt. It gets the configuration for the bolt and also the context of the bolt. The collector is used to emit or output the tuples from this bolt.

Link to public gist for the log: Storm topology log

The best answer:

Your log shows you are using LocalCluster. It is a testing/demo tool; don't use it for production workloads. Instead, set up a real distributed cluster.

Regarding what is happening:

When you run topologies in a LocalCluster, Storm simulates a real cluster by just running all the components (Nimbus, Supervisors and workers) as threads in a single JVM. Your log shows these lines:

20:14:12.451 [SLOT_1027] INFO o.a.s.ProcessSimulator - Begin killing process 2ea97301-24c9-4c1a-bcba-61008693971a

20:14:12.451 [SLOT_1027] INFO o.a.s.d.w.Worker - Shutting down worker smart-transactional-data-1-1566571315 72bbf510-c342-4385-9599-0821a2dee94e 1027

20:14:15.518 [SLOT_1027] INFO o.a.s.d.s.Slot - STATE running msInState: 33328 topo:smart-transactional-data-1-1566571315 worker:2ea97301-24c9-4c1a-bcba-61008693971a -> kill-blob-update msInState: 3001 topo:smart-transactional-data-1-1566571315 worker:2ea97301-24c9-4c1a-bcba-61008693971a

20:14:15.540 [SLOT_1027] INFO o.a.s.d.w.Worker - Launching worker for smart-transactional-data-1-1566571315

The LocalCluster is shutting down one of the simulated workers because one of the blobs in the blobstore (e.g. the topology jar, the topology configuration, or other types of shared files) changed. Normally when this happens in a real cluster, the worker JVM will be killed, the blob will be updated and the worker will restart. Since you are using LocalCluster, it just kills the worker thread and restarts it. Each restarted worker creates and prepares its bolts again, which is why you are seeing multiple invocations of prepare.
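To make the reasoning above concrete, here is a minimal toy model in plain Java. It does not use any Storm classes: ToyBolt, ToyWorker and PrepareLifecycleDemo are hypothetical stand-ins invented for illustration, and the model deliberately ignores executors, tasks and serialization. It only sketches the idea that prepare runs once per bolt instance, and that every worker relaunch produces a new instance.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-in for a bolt, NOT a Storm API.
class ToyBolt {
    private final List<String> events;
    private boolean prepared = false;

    ToyBolt(List<String> events) {
        this.events = events;
    }

    // Mirrors the contract quoted in the question: once per bolt instance.
    void prepare() {
        if (prepared) {
            throw new IllegalStateException("prepare called twice on one instance");
        }
        prepared = true;
        events.add("prepare");
    }

    void execute(String tuple) {
        events.add("execute:" + tuple);
    }
}

// Toy model only: hypothetical stand-in for a (simulated) worker.
class ToyWorker {
    private final List<String> events;
    private ToyBolt bolt;

    ToyWorker(List<String> events) {
        this.events = events;
    }

    // Launching a worker builds a fresh bolt instance and prepares it.
    void launch() {
        bolt = new ToyBolt(events);
        bolt.prepare();
    }

    // Killing a worker discards its bolt instance.
    void kill() {
        bolt = null;
    }

    void process(String tuple) {
        bolt.execute(tuple);
    }
}

class PrepareLifecycleDemo {
    // Launches a worker, then kills and relaunches it `restarts` times,
    // processing one tuple after each launch. Returns the event log.
    static List<String> run(int restarts) {
        List<String> events = new ArrayList<>();
        ToyWorker worker = new ToyWorker(events);
        worker.launch();
        worker.process("t0");
        for (int i = 1; i <= restarts; i++) {
            worker.kill();
            worker.launch();
            worker.process("t" + i);
        }
        return events;
    }

    static long countPrepares(List<String> events) {
        return events.stream().filter("prepare"::equals).count();
    }

    public static void main(String[] args) {
        List<String> events = run(2);
        System.out.println(events);
        System.out.println("prepare calls: " + countPrepares(events));
    }
}
```

With two restarts the log contains three prepare entries, one per launch, even though no single ToyBolt instance is ever prepared twice. That matches what the log in the question suggests: the "called only once" rule holds per bolt instance, and the repeated calls come from repeated worker launches.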

