Changes

Summary

  1. Experimental support for sparse dataframes (commit: 07a7612)
  2. Use a slightly optimized simpler FeatureGenerator per default. (commit: d6494e8)
  3. Make FeatureGenerator a dataclass to avoid redundant __init__. (commit: d2d73dc)
  4. Fixed issue in TableDocumentDescriber (commit: 8d71481)
  5. fix(corpus): parallel parsing (commit: 3a23d14)
  6. fix(corpus): register corpus attributes. (commit: 35b0614)
Commit 07a76122416398e5c422e5ff3da534fb3cb37a63 by Thorsten Vitt
Experimental support for sparse dataframes
(commit: 07a7612)
The file was modified delta/corpus.py
Commit d6494e8da61e974c15fd63d0f52334abbbdc1f4a by Thorsten Vitt
Use a slightly optimized simpler FeatureGenerator per default.
(commit: d6494e8)
The file was modified delta/corpus.py
Commit d2d73dcd6c6dbe9a53a47451b2fa1981cc0d6466 by Thorsten Vitt
Make FeatureGenerator a dataclass to avoid redundant __init__.
(commit: d2d73dc)
The file was modified delta/corpus.py
Commit 8d7148127819342d6360c67d3efe9d2586b4727e by Thorsten Vitt
Fixed issue in TableDocumentDescriber
(commit: 8d71481)
The file was modified delta/util.py
The file was modified test/corpus_test.py
The file was modified delta/corpus.py
Commit 35b061403021005713515a9040488e76886b3b48 by Thorsten Vitt
fix(corpus): register corpus attributes.

When a Corpus contained a token identical to one of its attribute names
(e.g., 'logger'), pandas would store the value assigned to the attribute
in (all rows of) the corresponding column of the underlying DataFrame.
(commit: 35b0614)
The file was modified test/corpus_test.py
The file was modified delta/corpus.py
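
The attribute/column collision described in commit 35b0614 can be reproduced with a small sketch. The class names PlainCorpus and RegisteredCorpus and the sample data are hypothetical and are not taken from delta/corpus.py. The sketch assumes that "registering" attributes means listing them in pandas' `_metadata` class attribute, which is the mechanism pandas documents for DataFrame subclasses; the actual commit may differ in detail.

```python
import pandas as pd


class PlainCorpus(pd.DataFrame):
    """Hypothetical subclass that does NOT register its attributes."""

    @property
    def _constructor(self):
        return PlainCorpus


class RegisteredCorpus(pd.DataFrame):
    """Hypothetical subclass that registers 'logger' as instance metadata."""

    # Assumption: names listed here are treated by pandas as regular
    # instance attributes instead of being resolved as column names.
    _metadata = ["logger"]

    @property
    def _constructor(self):
        return RegisteredCorpus


# Two documents; one of the tokens happens to be called 'logger'.
data = {"logger": [1, 2], "the": [10, 20]}
index = ["doc_a", "doc_b"]

plain = PlainCorpus(data, index=index)
plain.logger = "my-logger"  # pandas resolves this to the 'logger' column
plain_column = plain["logger"].tolist()

registered = RegisteredCorpus(data, index=index)
registered.logger = "my-logger"  # stored as an attribute, column untouched
registered_column = registered["logger"].tolist()

print(plain_column)
print(registered_column)
print(registered.logger)
```

In the unregistered case the assignment overwrites every row of the token column, which is the behaviour the commit message describes. With the name registered, the attribute and the column of the same name coexist, and the column remains reachable via `registered["logger"]`.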