OpenNews/hackdays/insideroutsider: Difference between revisions

'''From Matt MacDonald, NearbyFYI:'''  
* [http://said.nearbyfyi.com/docs/V1/ NearbyFYI - local government documents]
* [http://www.nearbyfyi.com Search interface for the documents we are collecting]
What would you do with 100,000+ documents and extracted text from 170 cities and towns in Vermont? We collect municipal documents: select board meeting minutes, planning and zoning committee records, and other local government legislation. These are often published as PDFs or as difficult-to-scrape HTML. We classify the documents, extract entities (people, companies, locations) and terms, and make them searchable. The result is a corpus of partially structured raw text from hundreds of cities and towns.

