I'm trying to use Crawlera alongside a local Splash instance. This is my Lua script:

```lua
function main(splash)
    function use_crawlera(splash)
        local user = splash.args.crawlera_user
        local host = 'proxy.crawlera.com'
        local port = 8010
        local session_header = 'X-Crawlera-Session'
        local session_id = 'create'

        splash:on_request(function(request)
            request:set_header('X-Crawlera-Cookies', 'disable')
            request:set_header(session_header, session_id)
            request:set_proxy { host, port, username = user, password = '' }
        end)

        splash:on_response_headers(function(response)
            if type(response.headers[session_header]) ~= nil then
                session_id = response.headers[session_header]
            end
        end)
    end

    function main(splash)
        use_crawlera(splash)
        splash:go(splash.args.url)
        splash:wait(30)
        return splash:html()
    end
end
```

and this is my start_request:

```python
yield SplashRequest(
    index_url,
    self.parse_kawanlama_index,
    endpoint='execute',
    args={
        'lua_source': lua_script,
        'wait': 5,
        'html': 1,
        'url': index_url,
        'timeout': 10,
        'crawlera_user': self.crawlera_apikey,
    },
    # tell Splash to cache the lua script, to avoid sending it for every request
    cache_args=['lua_source'],
)
```

but it doesn't seem to work: the response.body I get in self.parse(response) contains no HTML.
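For comparison, here is a minimal sketch of how such a script is usually laid out: a single top-level main() with the helper defined beside it. In the script above, the second `function main(splash)` is nested inside the outer one, so the outer main() only defines functions and returns nothing; also, Lua's type() never returns nil, so the header check here compares against the string 'nil'. Both changes are assumptions about the intended behaviour, not a verified fix:

```lua
-- Sketch only: single top-level main(), use_crawlera() as a plain helper.
-- Assumes crawlera_user is passed via splash.args, as in the SplashRequest above.
function use_crawlera(splash)
    local user = splash.args.crawlera_user
    local host = 'proxy.crawlera.com'
    local port = 8010
    local session_header = 'X-Crawlera-Session'
    local session_id = 'create'

    splash:on_request(function(request)
        request:set_header('X-Crawlera-Cookies', 'disable')
        request:set_header(session_header, session_id)
        request:set_proxy { host, port, username = user, password = '' }
    end)

    splash:on_response_headers(function(response)
        -- type() returns a string, so compare against 'nil', not nil
        if type(response.headers[session_header]) ~= 'nil' then
            session_id = response.headers[session_header]
        end
    end)
end

function main(splash)
    use_crawlera(splash)
    splash:go(splash.args.url)
    splash:wait(30)
    return splash:html()
end
```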


Did you test your Lua script by itself in the local Splash browser (localhost:8050)?

June 26, 2019

@malberts yes I did. I'm trying to scrape this page: kawanlama.com/brands/krisbow, and the local Splash browser response is Splash Response: ""

June 26, 2019
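For anyone who wants to reproduce that check outside the Splash web UI, here is a minimal sketch of calling the local /execute endpoint directly with Python requests and the same Lua script. The script file name, URL scheme, and API key placeholder are assumptions:

```python
# Sketch: exercise the Lua script against a local Splash instance
# (assumed to be listening on localhost:8050), outside of Scrapy.
import requests

# hypothetical file containing the Lua script shown above
with open('script.lua') as f:
    lua_script = f.read()

resp = requests.post(
    'http://localhost:8050/execute',
    json={
        'lua_source': lua_script,
        # exact scheme/host of the target page is assumed
        'url': 'https://www.kawanlama.com/brands/krisbow',
        # placeholder, not a real key; without it splash.args.crawlera_user is nil
        'crawlera_user': '<CRAWLERA_APIKEY>',
    },
)
print(resp.status_code)
print(resp.text[:500])
```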