Text mining of a museum collection in tabular format to find the year from which most objects derive and what those objects are.
Associated Tutorial
This workflow is part of the tutorial OpenRefine Tutorial for researching cultural data, available in the GTN
Features
- Includes [Galaxy Workflow ...
The workflow serves as a short introduction to Galaxy for users from the Humanities who mostly work with texts. The workflow compares two texts, visualises the differences with 'diff' and a word cloud, and extracts selected passages for further analysis.
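The comparison step can be pictured outside Galaxy with a short Python sketch. This is only an illustration under stated assumptions, not the workflow's actual tools: the two sample texts are invented, `diff_lines` and `word_frequencies` are hypothetical helper names, and the word counts merely stand in for the input a word cloud would be drawn from.

```python
import difflib
import re
from collections import Counter

# Hypothetical sample texts; the real workflow takes two uploaded text files.
text_a = "The museum opened in 1880.\nIt holds many ceramics.\n"
text_b = "The museum opened in 1881.\nIt holds many ceramics.\n"


def diff_lines(a, b):
    """Return a unified diff of two texts as a list of lines."""
    return list(
        difflib.unified_diff(
            a.splitlines(), b.splitlines(),
            fromfile="a", tofile="b", lineterm="",
        )
    )


def word_frequencies(text):
    """Count lowercase alphabetic words; a word cloud scales words by such counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Keep only added/removed lines, skipping the '---' and '+++' file headers.
changes = [
    line for line in diff_lines(text_a, text_b)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
frequencies = word_frequencies(text_a)

for line in changes:
    print(line)
print(frequencies.most_common(3))
```

Lines starting with `-` appear only in the first text and lines starting with `+` only in the second, which is the same convention the 'diff' output follows.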
Associated Tutorial
This workflow is part of the tutorial Introduction to Digital Humanities in Galaxy, available in ...
This workflow applies text mining to a museum collection in tabular format to find the year from which most objects derive and what those objects are. The first steps filter and clean the data to put it in the correct format. Datamash then shows how many documents the museum catalogue contains for each year. The output is a chronological table, which is visualised as a bar chart. From that table, the year from which most items derive is extracted. The next step keeps only the items from that year. The ...
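The counting and filtering steps described above can be sketched in plain Python. This is a minimal illustration, not the Galaxy workflow itself: the sample rows, the column names `title`, `object_type` and `year`, and the helper names are all assumptions made for the example, and a `Counter` stands in for the group-and-count that Datamash performs.

```python
import csv
import io
from collections import Counter

# Hypothetical tab-separated sample standing in for the museum catalogue.
# Column names are assumptions; the last row has no year on purpose.
SAMPLE = """title\tobject_type\tyear
Vase\tceramic\t1880
Plate\tceramic\t1880
Poster\tprint\t1921
Chair\tfurniture\t1880
Lamp\tglass\t
"""


def load_rows(text):
    """Parse the table and drop rows without a numeric year (data cleaning)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row["year"].strip().isdigit()]


def count_by_year(rows):
    """Return (year, count) pairs in chronological order."""
    counts = Counter(int(row["year"]) for row in rows)
    return sorted(counts.items())


def busiest_year(table):
    """Return the year with the highest item count."""
    return max(table, key=lambda pair: pair[1])[0]


rows = load_rows(SAMPLE)
table = count_by_year(rows)
peak = busiest_year(table)
# Keep only the items from the year with the most objects.
peak_items = [row for row in rows if int(row["year"]) == peak]

print(table)
print(peak, [row["title"] for row in peak_items])
```

The chronological `table` is the kind of data a bar chart would be drawn from, and `peak_items` corresponds to the subset that the later steps of the workflow analyse further.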