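This article looks at how to write a crawler with Storm. It walks through one class, IndexerBolt from the com.digitalpebble.storm.crawler package: a generic bolt that reads the name of an indexing implementation from the configuration, instantiates it, and delegates every call to it. Follow along with the 丸趣 TV editor as we go through the listing.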
package com.digitalpebble.storm.crawler.bolt.indexing;

import java.util.Map;

import org.slf4j.LoggerFactory;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

import com.digitalpebble.storm.crawler.StormConfiguration;
import com.digitalpebble.storm.crawler.util.Configuration;

/**
 * A generic bolt for indexing documents which determines which endpoint to use
 * based on the configuration and delegates the indexing to it.
 */
@SuppressWarnings("serial")
public class IndexerBolt extends BaseRichBolt {

    private Configuration config;

    // the concrete indexing bolt that all calls are forwarded to
    private BaseRichBolt endpoint;

    private static final org.slf4j.Logger LOG = LoggerFactory
            .getLogger(IndexerBolt.class);

    public void prepare(Map conf, TopologyContext context,
            OutputCollector collector) {
        config = StormConfiguration.create();

        // get the implementation to use
        // and instantiate it
        String className = config.get("stormcrawler.indexer.class");
        if (className == null) {
            throw new RuntimeException("No configuration found for indexing");
        }
        try {
            final Class<BaseRichBolt> implClass = (Class<BaseRichBolt>) Class
                    .forName(className);
            endpoint = implClass.newInstance();
        } catch (final Exception e) {
            throw new RuntimeException("Couldn't create " + className, e);
        }
        if (endpoint != null)
            endpoint.prepare(conf, context, collector);
    }

    public void execute(Tuple tuple) {
        if (endpoint != null)
            endpoint.execute(tuple);
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        if (endpoint != null)
            endpoint.declareOutputFields(declarer);
    }
}
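The core idea in IndexerBolt, reading a class name from configuration, instantiating it reflectively, and forwarding calls to it, can be tried without a Storm cluster. The following is a minimal sketch of that pattern only, not the storm-crawler implementation: the `Endpoint` interface, `UpperCaseEndpoint`, and `DelegateLoaderSketch` are hypothetical names introduced for illustration, and a plain `Map` stands in for the crawler's `Configuration`. Only the key `stormcrawler.indexer.class` and the two error messages are taken from the listing above.

```java
import java.util.HashMap;
import java.util.Map;

public class DelegateLoaderSketch {

    /** Hypothetical stand-in for BaseRichBolt: something that handles one document. */
    public interface Endpoint {
        String handle(String doc);
    }

    /** Hypothetical endpoint implementation, selected purely by its class name. */
    public static class UpperCaseEndpoint implements Endpoint {
        public String handle(String doc) {
            return doc.toUpperCase();
        }
    }

    /** Looks up the configured class name, instantiates it, and returns the delegate. */
    public static Endpoint load(Map<String, String> conf) {
        String className = conf.get("stormcrawler.indexer.class");
        if (className == null) {
            throw new RuntimeException("No configuration found for indexing");
        }
        try {
            // asSubclass checks the type up front instead of relying on an unchecked cast
            return Class.forName(className)
                    .asSubclass(Endpoint.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new RuntimeException("Couldn't create " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("stormcrawler.indexer.class", UpperCaseEndpoint.class.getName());

        Endpoint endpoint = load(conf);
        System.out.println(endpoint.handle("hello")); // prints "HELLO"
    }
}
```

Because the implementation is chosen by name at runtime, swapping the indexing backend in this style only requires changing the configured value, as long as the named class has a public no-argument constructor.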
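Thanks for reading. That covers how to write a crawler with Storm using a configurable indexing bolt. How well this works for your own use case is something you will need to verify in practice. This is 丸趣 TV; the 丸趣 TV editor will keep posting articles on related topics, so feel free to follow along.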