bug/Local API Error:Input should be 'by_title' #3849

Open
tangkunyin opened this issue Dec 23, 2024 · 1 comment
Labels
bug Something isn't working

Comments

tangkunyin commented Dec 23, 2024

Describe the bug

The basic chunking strategy is rejected when partitioning a Markdown file; only 'by_title' seems to be accepted.

How the error occurred

I checked the official documentation and found that by_page and by_similarity are only available in the Unstructured API and Platform.

I built a local API service from the https://github.com/Unstructured-IO/unstructured-api source at version 0.0.82. Everything worked except setting chunkingStrategy to basic.

Loader Options

{
  apiKey: '',
  apiUrl: 'http://127.0.0.1:8000/general/v0/general',
  strategy: 'auto',
  chunkingStrategy: 'basic'
}

Response by API

Failed to partition file /tmp/uploads/ec20069fa25bd07537eae8559fde792dff9d944288bc3c2ecbc8785ea98bc5c8.md with error 422 and message {"detail":[{"type":"literal_error","loc":["body","chunking_strategy"],"msg":"Input should be 'by_title'","input":"basic","ctx":{"expected":"'by_title'"}}]}
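
For anyone debugging the same 422, here is a minimal sketch of turning the validation payload above into a readable summary. The `summarizeValidationError` helper and the `ValidationDetail` type are hypothetical names for illustration, not part of any library; the field names are taken from the response quoted in this issue.

```typescript
// Shape of one entry in the "detail" array, as seen in the 422 response above.
interface ValidationDetail {
  type: string
  loc: (string | number)[]
  msg: string
  input?: unknown
  ctx?: Record<string, unknown>
}

// Hypothetical helper: produce one line per validation failure.
function summarizeValidationError(body: string): string[] {
  const parsed = JSON.parse(body) as { detail?: ValidationDetail[] }
  const details = parsed.detail ? parsed.detail : []
  return details.map(
    (d) => `${d.loc.join('.')}: ${d.msg} (received ${JSON.stringify(d.input)})`
  )
}

// The JSON part of the error message reported in this issue.
const responseBody =
  '{"detail":[{"type":"literal_error","loc":["body","chunking_strategy"],' +
  '"msg":"Input should be \'by_title\'","input":"basic","ctx":{"expected":"\'by_title\'"}}]}'

console.log(summarizeValidationError(responseBody))
```

This makes it easier to see that the server rejects the `chunking_strategy` field itself rather than the file.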

Expected behavior
No errors, and the file is parsed correctly.

Any feedback is appreciated. 🙏 😊 🙌

@tangkunyin tangkunyin added the bug Something isn't working label Dec 23, 2024

tangkunyin commented Dec 23, 2024

Unfortunately, I found an issue stating that only by_title is supported in the self-hosted version.

It would help to update the official documentation with a table that clarifies the differences between the self-hosted version and the public cloud API version.

Otherwise, we can't tell where the real problem is.
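
Until the documentation covers this, a client-side guard could flag the mismatch before the request is sent. This is only a sketch under assumptions: the strategy names are the ones mentioned in this issue, the self-hosted list reflects the 422 response seen with version 0.0.82 and may differ in other versions, and `checkChunkingStrategy` is a hypothetical helper, not an existing API.

```typescript
// Strategy names mentioned in this issue; not an authoritative list.
type ChunkingStrategy = 'basic' | 'by_title' | 'by_page' | 'by_similarity'

// Assumption: the self-hosted build accepts only 'by_title',
// as the 422 response from version 0.0.82 suggests.
const SELF_HOSTED_STRATEGIES: ChunkingStrategy[] = ['by_title']

// Hypothetical helper: returns a warning string, or null if the value looks fine.
function checkChunkingStrategy(selfHosted: boolean, strategy: ChunkingStrategy): string | null {
  if (selfHosted && SELF_HOSTED_STRATEGIES.indexOf(strategy) === -1) {
    return (
      `chunkingStrategy '${strategy}' is likely to be rejected by a self-hosted endpoint; ` +
      `accepted here: ${SELF_HOSTED_STRATEGIES.join(', ')}`
    )
  }
  return null
}

// With the loader options from this issue (local endpoint, 'basic'):
console.log(checkChunkingStrategy(true, 'basic'))
// With the value the server asked for:
console.log(checkChunkingStrategy(true, 'by_title'))
```

Passing `selfHosted` explicitly avoids guessing from the URL, since a self-hosted service is not necessarily on localhost.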
