Kangaroo LLM Begins Web Crawl for Australia's First Open AI Model
The Kangaroo LLM project has launched an extensive web crawl to support the creation of Australia's first open-source artificial intelligence model. The project's custom web crawler, Kangaroo Bot, will begin collecting data from 754,000 Australian websites on September 25, 2024, to build the VegeMighty dataset. The dataset is intended to be a comprehensive corpus of Australian English content that reflects the country's language and culture. Australia has over 4.2 million registered domains, and the project presents the crawl as a significant step towards an AI model that understands and represents Australian language and culture.

The project emphasizes ethical data collection, transparency, and data sovereignty, and says it will comply with national regulations. Vinod Bijlani, AI Practice Leader at Hewlett Packard Enterprise (HPE), a key partner in the Kangaroo LLM consortium, highlights the importance of this initiative for Australia's AI journey. The consortium, which includes industry leaders such as Katonic, RackCorp, NextDC, Hitachi Vantara, and HPE, views the effort as a crucial step towards establishing Australia as a leader in ethical AI development.

Website owners who wish to opt out of the Kangaroo Bot crawl can do so by adding the following to their robots.txt file:

    User-agent: Kangaroo Bot
    Disallow: /

The Kangaroo LLM project invites all Australians to take part, either by allowing their sites to be included in the dataset or by following the project's progress. The initiative aims to build a foundation for Australia's AI future by capturing the character of Australian online communication and culture.
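Site owners who add the opt-out rule may want to confirm that it parses as intended before the crawl begins. The following is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed and the example.com.au URL are hypothetical, and the check only shows how Python's parser reads the rule; the source does not describe how Kangaroo Bot itself matches its user-agent token.

```python
from urllib.robotparser import RobotFileParser

# The opt-out rule quoted in the article, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Kangaroo Bot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`.

    Hypothetical helper for illustration: it reflects the behaviour of
    Python's standard-library parser, not of the crawler itself.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; substitute a page from your own site.
    page = "https://example.com.au/about"
    print("Kangaroo Bot allowed:", is_allowed(ROBOTS_TXT, "Kangaroo Bot", page))
    print("Other crawler allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Under this parser the rule blocks only the named agent, so other crawlers remain unaffected unless the file contains further rules for them.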
Source: miragenews.com
Published on 2024-09-19