NER Console Features
Here are some Galileo features to help you quickly find errors in your data.
You can always adjust the DEP slider to filter this view and update the Insights.
Galileo automatically identifies whether any of the following errors are present per row:
a. Span Shift: A count of misaligned spans, where the predicted and gold spans overlap but their boundaries differ
b. Wrong Tag: A count of aligned predicted and gold spans whose labels do not match
c. Missed Span: A count of gold spans that have no corresponding predicted span
d. Ghost Span: A count of predicted spans that have no corresponding gold span
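To make the four error types concrete, here is a minimal Python sketch that counts them for a single row. It is an illustration only, not Galileo's implementation: the span format (start, end, label) with an exclusive end index, the function names, and the rule that "aligned" means identical boundaries are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical span format: (start, end_exclusive, label)
Span = Tuple[int, int, str]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def count_span_errors(gold: List[Span], pred: List[Span]) -> Counter:
    """Count the four error types for one row (assumed definitions)."""
    counts = Counter(span_shift=0, wrong_tag=0, missed_span=0, ghost_span=0)
    for g in gold:
        matches = [p for p in pred if overlaps(g, p)]
        if not matches:
            # Gold span with no overlapping prediction
            counts["missed_span"] += 1
            continue
        for p in matches:
            if (p[0], p[1]) != (g[0], g[1]):
                # Overlapping, but the boundaries differ
                counts["span_shift"] += 1
            elif p[2] != g[2]:
                # Same boundaries, different label
                counts["wrong_tag"] += 1
    for p in pred:
        if not any(overlaps(g, p) for g in gold):
            # Predicted span with no overlapping gold span
            counts["ghost_span"] += 1
    return counts


gold_spans = [(0, 2, "PER"), (5, 7, "ORG"), (10, 12, "LOC")]
pred_spans = [(0, 2, "ORG"), (5, 8, "ORG"), (15, 17, "PER")]
row_errors = count_span_errors(gold_spans, pred_spans)
```

In this example row, each error type occurs exactly once: the first prediction has the wrong tag, the second is shifted, the third gold span is missed, and the last prediction is a ghost span.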
Often it is critical to get a high-level view of the specific words the model is struggling with most. This NER-specific insight lists the words that are most frequently contained within spans with high DEP scores.
Click on any word to get a filtered view of the high DEP spans containing that word.
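The idea behind this insight can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how Galileo computes the list: the input format (span text paired with a DEP score), the 0.5 threshold, whitespace tokenization, and the function name are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def top_words_in_high_dep_spans(
    spans: List[Tuple[str, float]],
    dep_threshold: float = 0.5,  # assumed cutoff for "high DEP"
    top_k: int = 10,
) -> List[Tuple[str, int]]:
    """Return the most frequent words among spans at or above the threshold."""
    word_counts: Counter = Counter()
    for span_text, dep_score in spans:
        if dep_score >= dep_threshold:
            # Naive whitespace tokenization, lowercased for counting
            word_counts.update(span_text.lower().split())
    return word_counts.most_common(top_k)


example_spans = [
    ("New York", 0.9),
    ("York University", 0.8),
    ("Paris", 0.2),  # below the threshold, so ignored
]
top_words = top_words_in_high_dep_spans(example_spans, top_k=3)
```

Here "york" comes out on top because it appears in two high DEP spans, while "paris" is excluded because its span falls below the assumed threshold.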
Hover over any region to get a list of spans and their corresponding DEP scores.
Click a region to get a detailed view of a particular span.
After every run, you might want to prune your dataset to either
a. Prep it for the next training job
b. Send the dataset for re-labeling
You can think of the 'Edits Cart' as a means to capture all the dataset changes made during the discovery phase (removing/re-labeling rows and spans), so that you can act on the curated dataset as a whole.
At any point, you can export the dataset to a CSV file in an easy-to-view format.