What's your actual paper sourcing workflow when PDFs aren't available? #199881

amnaajmal24 · 2026-06-22T18:39:16Z

Discussion Type

Product Feedback

Discussion Content

One thing paper-qa handles really well is the QA layer over documents you already have. What I've been trying to figure out is the step before that: actually getting the papers in the first place, especially for topics that go beyond arXiv.

For context, I was building a corpus covering biomedical and social science literature, neither of which is well represented on arXiv. PubMed helps, but full-text availability is inconsistent. I ended up trying scholarapi.net, which pulls full text directly from open access sources across 30M+ papers, so no PDF hunting and no broken links. It saved a lot of time on what is honestly the most boring part of the whole pipeline.

Curious what others are doing here. Are you mostly working with papers you already have locally, or have you found a reliable way to source papers at scale for domains outside the usual arXiv/CS coverage?

2026-06-22T18:39:57Z

github-actions[bot]
Bot Jun 22, 2026

💬 Your Product Feedback Has Been Submitted 🎉

Thank you for taking the time to share your insights with us! Your feedback is invaluable as we build a better GitHub experience for all our users.

Here's what you can expect moving forward ⏩

- Your input will be carefully reviewed and cataloged by members of our product teams.
  - Due to the high volume of submissions, we may not always be able to provide individual responses.
  - Rest assured, your feedback will help chart our course for product improvements.
- Other users may engage with your post, sharing their own perspectives or experiences.
- GitHub staff may reach out for further clarification or insight.
  - We may 'Answer' your discussion if there is a current solution, workaround, or roadmap/changelog post related to the feedback.

Where to look to see what's shipping 👀

- Read the Changelog for real-time updates on the latest GitHub features, enhancements, and calls for feedback.
- Explore our Product Roadmap, which details upcoming major releases and initiatives.

What you can do in the meantime 💻

- Upvote and comment on other user feedback Discussions that resonate with you.
- Add more information at any point! Useful details include: use cases, relevant labels, desired outcomes, and any accompanying screenshots.

As a member of the GitHub community, your participation is essential. While we can't promise that every suggestion will be implemented, we want to emphasize that your feedback is instrumental in guiding our decisions and priorities.

Thank you once again for your contribution to making GitHub even better! We're grateful for your ongoing support and collaboration in shaping the future of our platform. ⭐
