Head
2026-06-16 08:20:55 [scrapy.utils.log] INFO: Scrapy 2.14.1 started (bot: event_scrapers)
2026-06-16 08:20:55 [scrapy.utils.log] INFO: Versions:
{'lxml': '6.0.2',
'libxml2': '2.14.6',
'cssselect': '1.3.0',
'parsel': '1.10.0',
'w3lib': '2.0.0',
'Twisted': '25.5.0',
'Python': '3.12.3 (main, Mar 23 2026, 19:04:32) [GCC 13.3.0]',
'pyOpenSSL': '25.3.0 (OpenSSL 3.5.4 30 Sep 2025)',
'cryptography': '46.0.3',
'Platform': 'Linux-6.8.0-90-generic-x86_64-with-glibc2.39'}
2026-06-16 08:20:55 [scrapy.crawler] DEBUG: Using AsyncCrawlerProcess
2026-06-16 08:20:55 [asyncio] DEBUG: Using selector: EpollSelector
2026-06-16 08:20:55 [scrapy.addons] INFO: Enabled addons:
[]
2026-06-16 08:20:55 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2026-06-16 08:20:55 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop
2026-06-16 08:20:56 [scrapy.extensions.telnet] INFO: Telnet Password: 854985660a1082bf
2026-06-16 08:20:56 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.logcount.LogCount',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats']
2026-06-16 08:20:56 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'event_scrapers',
'FEED_EXPORT_ENCODING': 'utf-8',
'FEED_URI_PARAMS': <function _feed_uri_params at 0x769308a3c540>,
'LOG_FILE': '/root/event-list-scraping/logs/event_scrapers/goose_darien/839b6d8c694b11f1a2ae0050565fa5d9.log',
'NEWSPIDER_MODULE': 'event_scrapers.spiders',
'REQUEST_FINGERPRINTER_CLASS': 'scrapy_zyte_api.ScrapyZyteAPIRequestFingerprinter',
'SPIDER_MODULES': ['event_scrapers.spiders']}
2026-06-16 08:20:56 [scrapy_zyte_api.handler] INFO: Using a Zyte API key starting with 'ff9baec'
2026-06-16 08:20:56 [scrapy_zyte_api.handler] INFO: Using a Zyte API key starting with 'ff9baec'
2026-06-16 08:20:56 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.offsite.OffsiteMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2026-06-16 08:20:56 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.start.StartSpiderMiddleware',
'scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy_zyte_api.ScrapyZyteAPISpiderMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware',
'scrapy_zyte_api.ScrapyZyteAPIRefererSpiderMiddleware']
2026-06-16 08:20:56 [scrapy.middleware] INFO: Enabled item pipelines:
['event_scrapers.pipelines.EventScrapersPipeline']
2026-06-16 08:20:56 [py.warnings] WARNING: /root/.venv/lib/python3.12/site-packages/scrapy/pipelines/__init__.py:47: ScrapyDeprecationWarning: EventScrapersPipeline.process_item() requires a spider argument, this is deprecated and the argument will not be passed in future Scrapy versions. If you need to access the spider instance you can save the crawler instance passed to from_crawler() and use its spider attribute.
self._check_mw_method_spider_arg(pipe.process_item)
2026-06-16 08:20:56 [scrapy.core.engine] INFO: Spider opened
2026-06-16 08:20:56 [py.warnings] WARNING: /root/.venv/lib/python3.12/site-packages/scrapy/core/spidermw.py:490: ScrapyDeprecationWarning: event_scrapers.spiders.goose_darien.GooseDarienSpider defines the deprecated start_requests() method. start_requests() has been deprecated in favor of a new method, start(), to support asynchronous code execution. start_requests() will stop being called in a future version of Scrapy. If you use Scrapy 2.13 or higher only, replace start_requests() with start(); note that start() is a coroutine (async def). If you need to maintain compatibility with lower Scrapy versions, when overriding start_requests() in a spider class, override start() as well; you can use super() to reuse the inherited start() implementation without copy-pasting. See the release notes of Scrapy 2.13 for details: https://docs.scrapy.org/en/2.13/news.html
warn(
2026-06-16 08:20:56 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2026-06-16 08:20:56 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2026-06-16 08:21:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://goosedarien.com/live-music/> (referer: https://goosedarien.com/live-music/)
2026-06-16 08:21:05 [py.warnings] WARNING: /root/event-list-scraping/event_scrapers/spiders/goose_darien.py:43: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 43 of the file /root/event-list-scraping/event_scrapers/spiders/goose_darien.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.
soup = bs(fulltext)
2026-06-16 08:21:05 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): 144.91.120.141:80
2026-06-16 08:21:06 [urllib3.connectionpool] DEBUG: http://144.91.120.141:80 "POST /api/v1/raw-events/ HTTP/1.1" 500 145
2026-06-16 08:21:06 [goose_darien] ERROR: API error 500:
<!doctype html>
<html lang="en">
<head>
<title>Server Error (500)</title>
</head>
<body>
<h1>Server Error (500)</h1><p></p>
</body>
</html>
2026-06-16 08:21:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://goosedarien.com/live-music/>
{'event_url': 'https://goosedarien.com/live-music/#04_–_FAKE_ID',
'platform_hash': '808d6bdaf02dec2d508262b1ce6d1105',
'raw_body': '<html><body><li class="p1">\n'
'<p class="p1"><strong>04 – FAKE ID</strong><br/>9:00 PM – 12:00 '
'AM<br/>Rock/ Blues & Soul Dance Band</p>\n'
'</li><div class="et_pb_section et_pb_section_1 '
'et_pb_with_background et_section_regular">\n'
'<div class="et_pb_row et_pb_row_0">\n'
'<div class="et_pb_column et_pb_column_1_2 et_pb_column_0 '
'et_pb_css_mix_blend_mode_passthrough">\n'
Tail
2026-06-16 08:21:06 [urllib3.connectionpool] DEBUG: http://144.91.120.141:80 "POST /api/v1/raw-events/ HTTP/1.1" 400 68
2026-06-16 08:21:06 [goose_darien] ERROR: API error 400: {"event_url":["Raw Event Data with this event url already exists."]}
2026-06-16 08:21:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://goosedarien.com/live-music/>
{'event_url': 'https://goosedarien.com/live-music/#18_–_ZULLY_&_THE_OG’S',
'platform_hash': '808d6bdaf02dec2d508262b1ce6d1105',
'raw_body': '<html><body><li class="p1"><strong>18 – ZULLY & THE '
'OG’S<br/></strong>9:00 PM – 12:00 AM<br/>Soul/ Pop/ Funk & '
'Dance<br/><em></em></li><div class="et_pb_section '
'et_pb_section_1 et_pb_with_background et_section_regular">\n'
'<div class="et_pb_row et_pb_row_0">\n'
'<div class="et_pb_column et_pb_column_1_2 et_pb_column_0 '
'et_pb_css_mix_blend_mode_passthrough">\n'
'<div class="et_pb_module et_pb_text et_pb_text_0 et_animated '
'et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><h3><span>Live MUSIC '
'2026</span></h3></div>\n'
'</div><div class="et_pb_module et_pb_divider_0 et_pb_space '
'et_pb_divider_hidden"><div '
'class="et_pb_divider_internal"></div></div><div '
'class="et_pb_module et_pb_text et_pb_text_1 '
'et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><h2 '
'class="tst-mb-30"><span>SATURDAYS<br/>9pm '
'-12am</span></h2></div>\n'
'</div><div class="et_pb_module et_pb_text et_pb_text_2 '
'et_animated et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><p>Get ready to ignite your '
'Saturday night at <em>The Goose</em>! 🎶✨ Join us for an '
'unforgettable evening as our stage comes alive with sensational '
'live music, featuring an incredible lineup of talented '
'performers. Whether you’re a die-hard music lover or just '
'looking for a great time, this is your ticket to an electrifying '
'experience you won’t want to miss! 🎤🥁🎸</p>\n'
'<p>\xa0</p>\n'
'<p>\xa0</p>\n'
'<h3><span style="color: #1f1f1f;"><em><strong>2026 – '
'April</strong></em></span><span style="color: '
'#1f1f1f;"><em><strong></strong></em></span></h3>\n'
'</div>\n'
'</div>\n'
'</div><div class="et_pb_column et_pb_column_1_2 et_pb_column_1 '
'et_pb_css_mix_blend_mode_passthrough et-last-child">\n'
'<div class="et_pb_module et_pb_divider_1 et_pb_space '
'et_pb_divider_hidden"><div '
'class="et_pb_divider_internal"></div></div><div '
'class="et_pb_module et_pb_image et_pb_image_0 et_animated '
'et-waypoint">\n'
'<span class="et_pb_image_wrap"><img alt="Goose Live Music March '
'2026." class="wp-image-5178" decoding="async" '
'fetchpriority="high" height="1350" sizes="(min-width: 0px) and '
'(max-width: 480px) 480px, (min-width: 481px) and (max-width: '
'980px) 980px, (min-width: 981px) 1080px, 100vw" '
'src="https://goosedarien.com/wp-content/uploads/2026/04/April-Music.webp" '
'srcset="https://goosedarien.com/wp-content/uploads/2026/04/April-Music.webp '
'1080w, '
'https://goosedarien.com/wp-content/uploads/2026/04/April-Music-980x1225.webp '
'980w, '
'https://goosedarien.com/wp-content/uploads/2026/04/April-Music-480x600.webp '
'480w" title="April Music" width="1080"/></span>\n'
'</div><ul class="et_pb_module et_pb_social_media_follow '
'et_pb_social_media_follow_0 clearfix et_pb_bg_layout_light">\n'
'<li class="et_pb_social_media_follow_network_0 et_pb_social_icon '
'et_pb_social_network_link et-social-facebook"><a class="icon '
'et_pb_with_border" '
'href="https://www.facebook.com/TheGooseinDarien/" '
'target="_blank" title="Follow on Facebook"><span '
'aria-hidden="true" '
'class="et_pb_social_media_follow_network_name">Follow</span></a></li><li '
'class="et_pb_social_media_follow_network_1 et_pb_social_icon '
'et_pb_social_network_link et-social-instagram"><a class="icon '
'et_pb_with_border" '
'href="https://www.instagram.com/thegoosedarien/" target="_blank" '
'title="Follow on Instagram"><span aria-hidden="true" '
'class="et_pb_social_media_follow_network_name">Follow</span></a></li>\n'
'</ul>\n'
'</div>\n'
'</div>\n'
'</div><div class="et_pb_module et_pb_text et_pb_text_0_tb_footer '
'et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><h4 style="text-align: '
'center;">DARIEN</h4>\n'
'<p style="text-align: center;"><span>972 Boston Post '
'Rd,<br/>Darien, CT</span></p></div>\n'
'</div></body></html>'}
2026-06-16 08:21:06 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): 144.91.120.141:80
2026-06-16 08:21:06 [urllib3.connectionpool] DEBUG: http://144.91.120.141:80 "POST /api/v1/raw-events/ HTTP/1.1" 400 68
2026-06-16 08:21:06 [goose_darien] ERROR: API error 400: {"event_url":["Raw Event Data with this event url already exists."]}
2026-06-16 08:21:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://goosedarien.com/live-music/>
{'event_url': 'https://goosedarien.com/live-music/#25_–_SHELL_SHOCK’T',
'platform_hash': '808d6bdaf02dec2d508262b1ce6d1105',
'raw_body': '<html><body><li class="p1"><strong>25 – SHELL '
'SHOCK’T<br/></strong>9:00 PM – 12:00 AM<br/><em>Classic Modern '
'Rock & Dance</em></li><div class="et_pb_section '
'et_pb_section_1 et_pb_with_background et_section_regular">\n'
'<div class="et_pb_row et_pb_row_0">\n'
'<div class="et_pb_column et_pb_column_1_2 et_pb_column_0 '
'et_pb_css_mix_blend_mode_passthrough">\n'
'<div class="et_pb_module et_pb_text et_pb_text_0 et_animated '
'et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><h3><span>Live MUSIC '
'2026</span></h3></div>\n'
'</div><div class="et_pb_module et_pb_divider_0 et_pb_space '
'et_pb_divider_hidden"><div '
'class="et_pb_divider_internal"></div></div><div '
'class="et_pb_module et_pb_text et_pb_text_1 '
'et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><h2 '
'class="tst-mb-30"><span>SATURDAYS<br/>9pm '
'-12am</span></h2></div>\n'
'</div><div class="et_pb_module et_pb_text et_pb_text_2 '
'et_animated et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><p>Get ready to ignite your '
'Saturday night at <em>The Goose</em>! 🎶✨ Join us for an '
'unforgettable evening as our stage comes alive with sensational '
'live music, featuring an incredible lineup of talented '
'performers. Whether you’re a die-hard music lover or just '
'looking for a great time, this is your ticket to an electrifying '
'experience you won’t want to miss! 🎤🥁🎸</p>\n'
'<p>\xa0</p>\n'
'<p>\xa0</p>\n'
'<h3><span style="color: #1f1f1f;"><em><strong>2026 – '
'April</strong></em></span><span style="color: '
'#1f1f1f;"><em><strong></strong></em></span></h3>\n'
'</div>\n'
'</div>\n'
'</div><div class="et_pb_column et_pb_column_1_2 et_pb_column_1 '
'et_pb_css_mix_blend_mode_passthrough et-last-child">\n'
'<div class="et_pb_module et_pb_divider_1 et_pb_space '
'et_pb_divider_hidden"><div '
'class="et_pb_divider_internal"></div></div><div '
'class="et_pb_module et_pb_image et_pb_image_0 et_animated '
'et-waypoint">\n'
'<span class="et_pb_image_wrap"><img alt="Goose Live Music March '
'2026." class="wp-image-5178" decoding="async" '
'fetchpriority="high" height="1350" sizes="(min-width: 0px) and '
'(max-width: 480px) 480px, (min-width: 481px) and (max-width: '
'980px) 980px, (min-width: 981px) 1080px, 100vw" '
'src="https://goosedarien.com/wp-content/uploads/2026/04/April-Music.webp" '
'srcset="https://goosedarien.com/wp-content/uploads/2026/04/April-Music.webp '
'1080w, '
'https://goosedarien.com/wp-content/uploads/2026/04/April-Music-980x1225.webp '
'980w, '
'https://goosedarien.com/wp-content/uploads/2026/04/April-Music-480x600.webp '
'480w" title="April Music" width="1080"/></span>\n'
'</div><ul class="et_pb_module et_pb_social_media_follow '
'et_pb_social_media_follow_0 clearfix et_pb_bg_layout_light">\n'
'<li class="et_pb_social_media_follow_network_0 et_pb_social_icon '
'et_pb_social_network_link et-social-facebook"><a class="icon '
'et_pb_with_border" '
'href="https://www.facebook.com/TheGooseinDarien/" '
'target="_blank" title="Follow on Facebook"><span '
'aria-hidden="true" '
'class="et_pb_social_media_follow_network_name">Follow</span></a></li><li '
'class="et_pb_social_media_follow_network_1 et_pb_social_icon '
'et_pb_social_network_link et-social-instagram"><a class="icon '
'et_pb_with_border" '
'href="https://www.instagram.com/thegoosedarien/" target="_blank" '
'title="Follow on Instagram"><span aria-hidden="true" '
'class="et_pb_social_media_follow_network_name">Follow</span></a></li>\n'
'</ul>\n'
'</div>\n'
'</div>\n'
'</div><div class="et_pb_module et_pb_text et_pb_text_0_tb_footer '
'et_pb_text_align_left et_pb_bg_layout_light">\n'
'<div class="et_pb_text_inner"><h4 style="text-align: '
'center;">DARIEN</h4>\n'
'<p style="text-align: center;"><span>972 Boston Post '
'Rd,<br/>Darien, CT</span></p></div>\n'
'</div></body></html>'}
2026-06-16 08:21:06 [scrapy.core.engine] INFO: Closing spider (finished)
2026-06-16 08:21:06 [scrapy.extensions.feedexport] INFO: Stored csv feed (4 items) in: output/2026/06/16/goose_darien.csv
2026-06-16 08:21:06 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 749,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 44916,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'elapsed_time_seconds': 10.082607,
'feedexport/success_count/FileFeedStorage': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2026, 6, 16, 6, 21, 6, 284594, tzinfo=datetime.timezone.utc),
'httpcompression/response_bytes': 179775,
'httpcompression/response_count': 1,
'item_scraped_count': 4,
'items_per_minute': 24.0,
'log_count/DEBUG': 13,
'log_count/ERROR': 4,
'log_count/INFO': 3,
'log_count/WARNING': 1,
'memusage/max': 92741632,
'memusage/startup': 92741632,
'response_received_count': 1,
'responses_per_minute': 6.0,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2026, 6, 16, 6, 20, 56, 201987, tzinfo=datetime.timezone.utc)}
2026-06-16 08:21:06 [scrapy.core.engine] INFO: Spider closed (finished)