Index
papertrail
papertrail - retrieve author publications and compute bibliometric metrics.
Example
    from papertrail import AuthorProfile

    profile = AuthorProfile("Marie Curie").fetch()
    metrics = profile.metrics()
    print(metrics.h_index)
AuthorProfile
Retrieve and analyse all publications of a single author.
AuthorProfile is the primary API surface of papertrail. It combines a `papertrail.fetchers.base.BaseFetcher` for data retrieval with optional `papertrail.metrics.impact_factor.ImpactFactorDatabase` enrichment and a set of export helpers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Author name (full name or last-name prefix). | required |
| `fetcher` | `BaseFetcher \| None` | Custom fetcher instance. Defaults to :class: | `None` |
| `email` | `str \| None` | E-mail passed to the default OpenAlex fetcher to enable the polite pool. Ignored when `fetcher` is provided explicitly. | `None` |
Example
    profile = AuthorProfile("Marie Curie").fetch()
    m = profile.metrics()
    print(m.h_index)
Source code in src/papertrail/author.py
publications
property
All retrieved publications (empty until `fetch()` is called).
author_info
property
Resolved author metadata, or None if not yet fetched.
use_impact_factor_database
Attach a custom impact factor database.
If publications have already been fetched, they are enriched immediately. Otherwise, enrichment happens automatically during `fetch()`.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `db` | `ImpactFactorDatabase` | A pre-loaded :class: | required |

Returns:

| Type | Description |
|---|---|
| `AuthorProfile` | |
Source code in src/papertrail/author.py
search_candidates
Return candidates matching `name` without fetching publications.
Useful for disambiguating common names before committing to a specific author ID.
Returns:
| Type | Description |
|---|---|
| `list[AuthorInfo]` | A list of :class: |

Raises:

| Type | Description |
|---|---|
| `FetchError` | If the API request fails. |
Source code in src/papertrail/author.py
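The disambiguation workflow described above can be sketched as a small helper. This is a minimal sketch and not part of papertrail: `fetch_disambiguated` and the `choose` callback are hypothetical names. It relies only on the documented `search_candidates()` and `fetch(author_id=..., max_results=...)` methods and on the `id` attribute of `AuthorInfo`.

```python
from typing import Any, Callable, Optional, Sequence


def fetch_disambiguated(
    profile: Any,
    choose: Callable[[Sequence[Any]], Any],
    max_results: Optional[int] = None,
) -> Any:
    """Pick one candidate for an ambiguous name, then fetch by its ID.

    `profile` is expected to behave like an AuthorProfile. `choose` is a
    hypothetical callback that selects one candidate from the list.
    """
    candidates = profile.search_candidates()
    if not candidates:
        raise LookupError("no candidates returned for this name")
    chosen = choose(candidates)
    # Fetch with the explicit identifier instead of relying on the name.
    return profile.fetch(author_id=chosen.id, max_results=max_results)
```

A caller might pass `lambda candidates: candidates[0]` to accept the first candidate, or a function that filters on `orcid` or `affiliations` when those fields are populated.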
fetch
Fetch publications for this author.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `author_id` | `str \| None` | Explicit author identifier (e.g. OpenAlex author ID URL). When | `None` |
| `max_results` | `int \| None` | Cap the number of returned publications. | `None` |

Returns:

| Type | Description |
|---|---|
| `AuthorProfile` | |

Raises:

| Type | Description |
|---|---|
| `AuthorNotFoundError` | If no author matches the name. |
| `MultipleAuthorsFoundError` | Raised only when `author_id` is |
| `FetchError` | If an API request fails. |
Example
    >>> profile = AuthorProfile("Ada Lovelace").fetch()
    >>> len(profile.publications) > 0
    True
Source code in src/papertrail/author.py
metrics
Compute bibliometric metrics from the fetched publications.
Returns:
| Type | Description |
|---|---|
| `AuthorMetrics` | An :class: |

Raises:

| Type | Description |
|---|---|
| `RuntimeError` | If :meth: |
Source code in src/papertrail/author.py
export_bibtex
Export publications to a BibTeX .bib file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Destination file path. | required |

Raises:

| Type | Description |
|---|---|
| `ExportError` | If the file cannot be written. |
Source code in src/papertrail/author.py
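To make the shape of a BibTeX export concrete, here is a minimal sketch of formatting a single entry. It is an illustration only: `to_bibtex_entry`, the choice of `@article`, and the citation-key scheme (first author's last word plus year) are assumptions and may differ from what papertrail writes.

```python
import re
from typing import Optional, Sequence


def to_bibtex_entry(
    title: str,
    year: int,
    authors: Sequence[str],
    journal: Optional[str] = None,
    doi: Optional[str] = None,
) -> str:
    """Format one publication as a BibTeX @article entry."""
    # Hypothetical citation key: last word of the first author's name + year.
    first = authors[0].split()[-1] if authors else "anon"
    key = re.sub(r"[^A-Za-z0-9]", "", first).lower() + str(year)
    fields = {
        "title": title,
        "author": " and ".join(authors),
        "year": str(year),
        "journal": journal,
        "doi": doi,
    }
    # Skip fields that are empty or None so the entry stays valid.
    lines = [f"  {name} = {{{value}}}," for name, value in fields.items() if value]
    return "@article{" + key + ",\n" + "\n".join(lines) + "\n}"
```

Writing the joined entries to `path` and translating any `OSError` into `ExportError` would complete the picture described in the table above.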
export_publications
Export the publication list to a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Destination file path. | required |
| `fmt` | `ExportFormat` | Output format - | `'json'` |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `fmt` is not supported. |
| `ExportError` | If the file cannot be written. |
Source code in src/papertrail/author.py
export_metrics
Compute and export metrics to a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Destination file path. | required |
| `fmt` | `ExportFormat` | Output format - | `'json'` |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `fmt` is not supported. |
| `ExportError` | If the file cannot be written. |
Source code in src/papertrail/author.py
dashboard
Build the default interactive Bokeh dashboard for this profile.
Returns:
| Type | Description |
|---|---|
| `object` | A Bokeh layout containing the available plots. |

Raises:

| Type | Description |
|---|---|
| `RuntimeError` | If :meth: |
Source code in src/papertrail/author.py
export_dashboard
Export the default interactive dashboard.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Destination file path. | required |
| `fmt` | `PlotFormat` | Output format - | `'html'` |
Source code in src/papertrail/author.py
AuthorNotFoundError
Bases: PapertrailError
Raised when no author matches the given name or ID.
ExportError
Bases: PapertrailError
Raised when exporting data to a file fails.
FetchError
Bases: PapertrailError
Raised when an external API request fails.
ADSFetcher
Bases: BaseFetcher
Fetcher backed by the NASA ADS Search API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `token` | `str \| None` | ADS API token. If omitted, reads | `None` |

Raises:

| Type | Description |
|---|---|
| `FetchError` | If no token is available. |
Source code in src/papertrail/fetchers/ads.py
search_authors
Return a single ADS candidate derived from the provided name.
ADS has no dedicated author-entity endpoint equivalent to the OpenAlex author search used in this package's workflow, so this method returns a single candidate whose author query is the original input string.
Source code in src/papertrail/fetchers/ads.py
fetch_publications
Fetch ADS publications for an author query string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `author_id` | `str` | ADS author query string (e.g. | required |
| `max_results` | `int \| None` | Optional cap on returned publications. | `None` |

Returns:

| Type | Description |
|---|---|
| `list[Publication]` | List of parsed publications. |
Source code in src/papertrail/fetchers/ads.py
fetch_analyze_metrics
Fetch ADS native analyze metrics for the fetched publication set.
Uses the ADS Metrics API to retrieve indicator and time-series payloads when bibcodes are available.
Source code in src/papertrail/fetchers/ads.py
ImpactFactorDatabase
In-memory store of journal impact factors indexed by ISSN and year.
Example
    from pathlib import Path

    db = ImpactFactorDatabase()
    db.load_csv(Path("jif_data.csv"))
    enriched = db.enrich_publications(publications)
Source code in src/papertrail/metrics/impact_factor.py
load_csv
Load impact factors from a CSV file.
The file must contain at minimum the columns issn, year, and
impact_factor. Additional columns are silently ignored.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path` | Path to the CSV file. | required |

Raises:

| Type | Description |
|---|---|
| `FileNotFoundError` | If `path` does not exist. |
| `KeyError` | If a required column is missing. |
| `ValueError` | If a numeric field cannot be parsed. |
Source code in src/papertrail/metrics/impact_factor.py
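The CSV layout described above can be illustrated with a stand-alone parser. This is a minimal sketch under stated assumptions: `parse_impact_factor_csv` is a hypothetical helper (not the library's loader), it reads from a string instead of a `Path`, and the ISSN and values in `SAMPLE` are made-up placeholders.

```python
import csv
import io
from typing import Dict, Tuple


def parse_impact_factor_csv(text: str) -> Dict[Tuple[str, int], float]:
    """Parse CSV text that has issn, year and impact_factor columns."""
    table: Dict[Tuple[str, int], float] = {}
    for row in csv.DictReader(io.StringIO(text)):
        # A missing header column raises KeyError here; a value that is
        # not numeric raises ValueError from int() or float().
        key = (row["issn"].strip(), int(row["year"]))
        table[key] = float(row["impact_factor"])
    return table


# Placeholder data: the extra "notes" column is simply never read.
SAMPLE = """issn,year,impact_factor,notes
1234-5678,2020,3.5,extra columns are ignored
1234-5678,2021,4.0,
"""

table = parse_impact_factor_csv(SAMPLE)
```

Keying the table by `(issn, year)` mirrors the "indexed by ISSN and year" description of the database, although the real internal layout is not shown in this page.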
load_json
Load impact factors from a JSON file.
The file must be a JSON object mapping ISSN strings to objects that map year strings (or integers) to float values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path` | Path to the JSON file. | required |

Raises:

| Type | Description |
|---|---|
| `FileNotFoundError` | If `path` does not exist. |
| `JSONDecodeError` | If the file is not valid JSON. |
| `ValueError` | If a numeric field cannot be parsed. |
Source code in src/papertrail/metrics/impact_factor.py
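The JSON layout described above (ISSN mapped to year mapped to value) might look like the sample below. This is a sketch only: `parse_impact_factor_json` is a hypothetical helper working on a string, and the ISSN and values are placeholders.

```python
import json
from typing import Dict


def parse_impact_factor_json(text: str) -> Dict[str, Dict[int, float]]:
    """Parse a JSON object mapping ISSN -> year -> impact factor."""
    raw = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    return {
        # Keys in a JSON object are strings, so years are converted to int;
        # int() and float() raise ValueError on non-numeric text.
        issn: {int(year): float(value) for year, value in by_year.items()}
        for issn, by_year in raw.items()
    }


# Placeholder data in the documented shape.
SAMPLE = '{"1234-5678": {"2020": 3.5, "2021": 4.0}}'

parsed = parse_impact_factor_json(SAMPLE)
```

Normalising the year keys to integers up front keeps later lookups by `year: int` straightforward.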
get_impact_factor
Return the impact factor for a journal in a given year.
If an exact match is not found, values within ±tolerance years
are checked in order of proximity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `issn` | `str` | ISSN string (e.g. | required |
| `year` | `int` | Target year. | required |
| `tolerance` | `int` | How many years to search around `year` when an exact match is unavailable. Defaults to | `1` |

Returns:

| Type | Description |
|---|---|
| `float \| None` | The impact factor as a float, or |
Source code in src/papertrail/metrics/impact_factor.py
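The proximity search described above can be sketched in a few lines. This is an illustration, not the library's code: `lookup_impact_factor` and the nested-dict `table` are hypothetical, and preferring the earlier year when two years are equally close is an assumption the documentation does not settle.

```python
from typing import Dict, Optional


def lookup_impact_factor(
    table: Dict[str, Dict[int, float]],
    issn: str,
    year: int,
    tolerance: int = 1,
) -> Optional[float]:
    """Return the value for (issn, year), or the nearest year within tolerance."""
    by_year = table.get(issn, {})
    if year in by_year:
        return by_year[year]
    # Widen the search one year at a time so that nearer years win.
    for offset in range(1, tolerance + 1):
        # Assumption: on a tie, the earlier year is checked first.
        for candidate in (year - offset, year + offset):
            if candidate in by_year:
                return by_year[candidate]
    return None
```

With a table holding values only for 2018 and 2021, a lookup for 2020 with the default tolerance falls back to 2021, while a lookup for 2023 finds nothing unless `tolerance` is raised to 2.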
enrich_publications
Return a copy of publications enriched with IF data from this database.
For each publication that has a journal with at least one ISSN, an IF lookup is performed. If a value is found, the publication's `papertrail.models.JournalInfo.impact_factor` and `papertrail.models.JournalInfo.impact_factor_year` fields are updated.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `publications` | `list[Publication]` | Original list of publications. | required |
| `tolerance` | `int` | Year tolerance passed to :meth: | `1` |

Returns:

| Type | Description |
|---|---|
| `list[Publication]` | A new list of :class: |
| `list[Publication]` | Publications without journal data are returned unchanged. |
Source code in src/papertrail/metrics/impact_factor.py
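The copy-and-enrich behaviour described above can be sketched with plain dataclasses. Everything here is a stand-in: `Journal`, `Paper`, and `enrich` are hypothetical names used in place of the pydantic models, and the lookup matches the exact year only (the tolerance search is omitted for brevity).

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Journal:  # stand-in for JournalInfo
    issn: List[str]
    impact_factor: Optional[float] = None
    impact_factor_year: Optional[int] = None


@dataclass(frozen=True)
class Paper:  # stand-in for Publication
    title: str
    year: int
    journal: Optional[Journal] = None


def enrich(papers: List[Paper], table: Dict[str, Dict[int, float]]) -> List[Paper]:
    """Return a new list of papers; the input objects are never mutated."""
    result: List[Paper] = []
    for paper in papers:
        updated = paper  # papers without journal data pass through unchanged
        if paper.journal is not None:
            for issn in paper.journal.issn:
                value = table.get(issn, {}).get(paper.year)
                if value is not None:
                    # Build modified copies instead of editing in place.
                    journal = replace(
                        paper.journal,
                        impact_factor=value,
                        impact_factor_year=paper.year,
                    )
                    updated = replace(paper, journal=journal)
                    break
        result.append(updated)
    return result
```

Returning copies keeps the originally fetched list intact, which is useful when the same publications are enriched against different databases.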
AuthorInfo
Bases: BaseModel
Identifies a single author on a publication.
Attributes:
| Name | Type | Description |
|---|---|---|
| `id` | `str \| None` | Unique identifier (e.g. OpenAlex author ID URL). |
| `name` | `str` | Full display name. |
| `orcid` | `str \| None` | ORCID identifier URL, if available. |
| `affiliations` | `list[Affiliation]` | Institutional affiliations associated with this authorship. |
Source code in src/papertrail/models.py
AuthorMetrics
Bases: BaseModel
Aggregated bibliometric metrics for an author.
Attributes:
| Name | Type | Description |
|---|---|---|
| `author_name` | `str` | Display name used to retrieve publications. |
| `openalex_id` | `str \| None` | OpenAlex author ID URL, if resolved. |
| `orcid` | `str \| None` | ORCID identifier URL, if available. |
| `total_publications` | `int` | Total number of retrieved publications. |
| `total_citations` | `int` | Sum of citation counts across all publications. |
| `h_index` | `int` | Hirsch index. |
| `i10_index` | `int` | Number of publications with at least 10 citations. |
| `average_citations_per_paper` | `float` | Mean citations per publication. |
| `most_cited_paper_title` | `str \| None` | Title of the most-cited publication. |
| `most_cited_paper_citations` | `int` | Citation count of the most-cited publication. |
| `publications_per_year` | `dict[int, int]` | Mapping of year -> publication count. |
| `citations_per_year` | `dict[int, int]` | Mapping of year -> sum of citations for that year's pubs. |
| `publications_refereed_per_year` | `dict[int, int]` | Mapping of year -> refereed publication count. |
| `publications_non_refereed_per_year` | `dict[int, int]` | Mapping of year -> non-refereed publication count. |
| `publications_refereed_normalized_per_year` | `dict[int, float]` | Mapping of year -> refereed publication fraction within that year. |
| `publications_non_refereed_normalized_per_year` | `dict[int, float]` | Mapping of year -> non-refereed publication fraction within that year. |
| `citations_refereed_per_year` | `dict[int, int]` | Mapping of year -> citations from refereed publications. |
| `citations_non_refereed_per_year` | `dict[int, int]` | Mapping of year -> citations from non-refereed publications. |
| `citations_refereed_normalized_per_year` | `dict[int, float]` | Mapping of year -> refereed citation fraction within that year. |
| `citations_non_refereed_normalized_per_year` | `dict[int, float]` | Mapping of year -> non-refereed citation fraction within that year. |
| `index_timeseries_total` | `dict[str, dict[int, float]]` | Mapping of index name -> year -> value. |
| `index_timeseries_refereed` | `dict[str, dict[int, float]]` | Mapping of index name -> year -> value. |
| `index_indicators_total` | `dict[str, float]` | Mapping of index name -> snapshot value. |
| `index_indicators_refereed` | `dict[str, float]` | Mapping of index name -> snapshot value. |
| `publication_types` | `dict[str, int]` | Mapping of publication type -> publication count. |
| `journals_per_publication` | `dict[str, int]` | Mapping of journal/venue name -> publication count. |
| `citation_distribution` | `dict[str, int]` | Mapping of citation bucket -> publication count. |
| `refereed_publications` | `int \| None` | Count of publications marked as refereed. |
| `non_refereed_publications` | `int \| None` | Count of publications marked as non-refereed. |
| `avg_impact_factor` | `float \| None` | Mean impact factor across publications with IF data. |
| `median_impact_factor` | `float \| None` | Median impact factor across publications with IF data. |
Source code in src/papertrail/models.py
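Several of the attributes above follow standard definitions that are easy to state in code. The sketch below uses the textbook definitions of the Hirsch index and the i10-index plus a per-year grouping by publication year. The function names are hypothetical and this is not necessarily how papertrail computes the fields.

```python
from collections import Counter
from typing import Dict, List, Tuple


def h_index(citations: List[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count < rank:
            break
        h = rank
    return h


def i10_index(citations: List[int]) -> int:
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


def per_year(papers: List[Tuple[int, int]]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Group (publication_year, citation_count) pairs by publication year.

    Returns (publications_per_year, citations_per_year).
    """
    pubs: Counter = Counter()
    cites: Counter = Counter()
    for year, count in papers:
        pubs[year] += 1
        cites[year] += count
    return dict(pubs), dict(cites)
```

For citation counts `[10, 8, 5, 4, 3]`, `h_index` returns 4: four papers have at least four citations each, but there are not five papers with at least five.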
JournalInfo
Bases: BaseModel
Journal or venue metadata.
Attributes:
| Name | Type | Description |
|---|---|---|
| `id` | `str \| None` | Unique identifier (e.g. OpenAlex source ID URL). |
| `name` | `str` | Full journal/venue name. |
| `issn` | `list[str]` | List of ISSN numbers (print and electronic). |
| `publisher` | `str \| None` | Publisher name. |
| `impact_factor` | `float \| None` | Impact factor or proxy metric (e.g. OpenAlex |
| `impact_factor_year` | `int \| None` | Year the impact factor value corresponds to. |
Source code in src/papertrail/models.py
Publication
Bases: BaseModel
A single scientific publication.
Attributes:
| Name | Type | Description |
|---|---|---|
| `id` | `str` | Unique identifier (e.g. OpenAlex work ID URL). |
| `title` | `str` | Publication title. |
| `year` | `int` | Publication year. |
| `doi` | `str \| None` | Digital Object Identifier (without the |
| `authors` | `list[AuthorInfo]` | Ordered list of authors. |
| `journal` | `JournalInfo \| None` | Journal or venue metadata. |
| `citation_count` | `int` | Total citations received. |
| `abstract` | `str \| None` | Plain-text abstract, if available. |
| `type` | `str \| None` | Publication type string (e.g. |
| `refereed` | `bool \| None` | Whether this record is marked as refereed by the source, when available (not provided by all data sources). |
| `open_access` | `bool` | Whether the publication is openly accessible. |
| `url` | `str \| None` | Landing-page URL for the publication. |
Source code in src/papertrail/models.py
build_author_dashboard
Build the initial multi-plot dashboard for an author.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `AuthorMetrics` | Computed author metrics. | required |

Returns:

| Type | Description |
|---|---|
| `object` | A Bokeh layout containing the available plots. |
Source code in src/papertrail/plots/bokeh_plotter.py
build_citations_per_year_plot
Build an interactive line chart of citations per publication year.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `AuthorMetrics` | Computed author metrics. | required |

Returns:

| Type | Description |
|---|---|
| `object` | A Bokeh figure. |
Source code in src/papertrail/plots/bokeh_plotter.py
build_publications_per_year_plot
Build an interactive bar chart of publications per year.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `AuthorMetrics` | Computed author metrics. | required |

Returns:

| Type | Description |
|---|---|
| `object` | A Bokeh figure. |
Source code in src/papertrail/plots/bokeh_plotter.py
build_refereed_breakdown_plot
Build a bar chart comparing refereed and non-refereed publication counts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `AuthorMetrics` | Computed author metrics. | required |

Returns:

| Type | Description |
|---|---|
| `object \| None` | A Bokeh figure when refereed metadata is available, otherwise |
Source code in src/papertrail/plots/bokeh_plotter.py
export_dashboard
Export the default dashboard as HTML or JSON.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `AuthorMetrics` | Computed author metrics. | required |
| `path` | `str \| Path` | Output file path. | required |
| `fmt` | `PlotFormat` | One of | required |

Raises:

| Type | Description |
|---|---|
| `ExportError` | If the output cannot be written. |