function get_bib
Fetches BibTeX citation data for a given DOI (Digital Object Identifier) from the CrossRef API.
/tf/active/vicechatdev/offline_parser_docstore.py
31 - 52
simple
Purpose
This function retrieves bibliographic information in BibTeX format for academic papers and publications using their DOI. It queries the CrossRef API's transformation service to convert DOI metadata into a formatted BibTeX citation string. This is useful for automated citation management, bibliography generation, and academic reference systems.
Source Code
def get_bib(doi):
    """
    Parameters
    ----------
    doi: str
    Returns
    -------
    found: bool
    bib: str
    """
    bare_url = "http://api.crossref.org/"
    url = "{}works/{}/transform/application/x-bibtex"
    url = url.format(bare_url, doi)
    r = requests.get(url)
    #found = False if r.status_code != 200 else True
    bib = r.content
    bib = str(bib, "utf-8")
    return bib
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| doi | - | - | positional_or_keyword |
Parameter Details
doi: A string containing the Digital Object Identifier (DOI) for a publication, in standard DOI format (e.g., '10.1000/xyz123'). The function inserts the value into the request URL unchanged and does not strip a 'doi:' or 'https://doi.org/' prefix, so passing the bare DOI is the safest choice; prefixed forms work only if the CrossRef API accepts them.
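Because get_bib places the doi value into the request URL as given, a caller may want to clean the input first. The following is a minimal sketch of such a step, not part of the original module: the helper name normalize_doi, the prefix list, and the loose format pattern are all assumptions chosen for illustration, and the pattern is a heuristic rather than a complete DOI grammar.

```python
import re

# Assumed set of prefixes a caller might paste in; extend as needed.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

# Loose heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix.
_DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw):
    """Strip common prefixes and return a bare DOI (hypothetical helper)."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not _DOI_PATTERN.match(doi):
        raise ValueError("not a recognizable DOI: {!r}".format(raw))
    return doi
```

The result can be passed straight to get_bib, for example get_bib(normalize_doi("doi:10.1000/xyz123")). Rejecting malformed input here also avoids a network call that cannot succeed.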
Return Value
Returns a string containing the BibTeX-formatted citation for the publication, with standard fields such as author, title, journal, and year. Note: the docstring lists a tuple (found: bool, bib: str), but the implementation returns only the bib string. The status code is never checked, so when the DOI is not found or the API reports an error, the function returns whatever response body the API sent (an error message or empty content), decoded as UTF-8. Network failures raised by requests.get() propagate to the caller.
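Since the function returns a string whether or not the lookup succeeded, a caller has to inspect the text to tell a citation from an error body. One possible check is sketched below; the helper name looks_like_bibtex is hypothetical, and the test is a heuristic on the opening of the string, not a BibTeX parser.

```python
import re

# A BibTeX entry opens with "@", an entry type, and "{", e.g. "@article{".
_ENTRY_START = re.compile(r"\s*@[A-Za-z]+\s*\{")


def looks_like_bibtex(text):
    """Return True if text appears to start with a BibTeX entry (heuristic)."""
    return bool(_ENTRY_START.match(text))
```

A caller could then write bib = get_bib(doi) and treat the result as a failure when looks_like_bibtex(bib) is False.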
Dependencies
requests
Required Imports
import requests
Usage Example
import requests

def get_bib(doi):
    bare_url = "http://api.crossref.org/"
    url = "{}works/{}/transform/application/x-bibtex"
    url = url.format(bare_url, doi)
    r = requests.get(url)
    bib = r.content
    bib = str(bib, "utf-8")
    return bib

# Example usage
doi = "10.1038/nature12373"
bib_citation = get_bib(doi)
print(bib_citation)

# Output will be a BibTeX formatted string like:
# @article{Author_2013,
#   title={Article Title},
#   author={Author, First and Author, Second},
#   journal={Nature},
#   year={2013},
#   ...
# }
Best Practices
- Add error handling to check r.status_code before processing the response (the commented-out 'found' variable suggests this was intended)
- Consider adding timeout parameter to requests.get() to prevent hanging on slow connections
- Validate the DOI format before making the API request to avoid unnecessary network calls
- Handle potential exceptions from requests.get() (network errors, timeouts, etc.)
- The function should return both success status and bib content as indicated in the docstring, or update the docstring to match actual behavior
- Consider using HTTPS instead of HTTP for the API URL for better security
- Add retry logic for transient network failures
- Cache results for frequently requested DOIs to reduce API calls
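Several of the points above (status check, timeout, exception handling, HTTPS, retries, and the tuple return described in the docstring) can be combined in one variant. The following is a sketch under stated assumptions, not the module's actual code: the name get_bib_checked, its default values, the backoff schedule, and the optional session parameter are all hypothetical choices made for illustration.

```python
import time

import requests

CROSSREF_URL = "https://api.crossref.org/works/{}/transform/application/x-bibtex"


def get_bib_checked(doi, timeout=10.0, retries=2, backoff=0.5, session=None):
    """Return (found, bib) for a DOI; hypothetical hardened variant of get_bib.

    session may be any object with a requests-style get(url, timeout=...)
    method, such as a requests.Session; it defaults to the requests module.
    """
    http = requests if session is None else session
    url = CROSSREF_URL.format(doi)
    for attempt in range(retries + 1):
        try:
            r = http.get(url, timeout=timeout)
        except requests.RequestException:
            r = None  # network error or timeout: treat as retryable
        if r is not None:
            if r.status_code == 200:
                return True, r.content.decode("utf-8")
            if r.status_code < 500:
                # Client errors such as 404 will not succeed on retry.
                return False, ""
        if attempt < retries:
            # Wait a little longer before each further attempt.
            time.sleep(backoff * 2 ** attempt)
    return False, ""
```

Usage would look like found, bib = get_bib_checked("10.1038/nature12373"), with bib empty whenever found is False. Caching is left out here; if it is added, storing only successful lookups avoids remembering transient failures.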
Similar Components
AI-powered semantic similarity - components with related functionality:
- function get_bibtext 74.7% similar
- function api_get_extensive_reference 37.7% similar
- function parse_references_section 37.3% similar
- function api_get_document 37.2% similar
- function api_get_reference_document 36.7% similar