# How-to: Adjust LLM Parameters

This guide explains how to tune LLM parameters to optimize extraction quality, consistency, and performance.
## Parameter Overview
LLM parameters control how the model generates text. The most important ones for metadata extraction are:
| Parameter | Range | Purpose | Default |
|---|---|---|---|
| `temperature` | 0.0-2.0 | Controls randomness | 0.2 |
| `top_p` | 0.0-1.0 | Nucleus sampling | 0.9 |
| `top_k` | 1-100 | Limits token choices | 40 |
| `num_ctx` | 128-32768 | Context window size | 2048 |
| `num_predict` | -1 or 1-2048 | Max tokens to generate | -1 (unlimited) |
| `repeat_penalty` | 0.0-2.0 | Penalizes repetition | 1.1 |
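These parameter names (`num_ctx`, `num_predict`, `repeat_penalty`) match Ollama's generation options. If your backend is Ollama, a minimal sketch of passing the table's defaults through the official Python client might look like the following; the model name and prompt are placeholders, not nerxiv internals:

```python
import ollama

# Generation options mirroring the defaults from the table above.
options = {
    "temperature": 0.2,     # low randomness for factual extraction
    "top_p": 0.9,           # nucleus sampling threshold
    "top_k": 40,            # limit candidate tokens per step
    "num_ctx": 2048,        # context window (input + output tokens)
    "num_predict": -1,      # -1 = no explicit output limit
    "repeat_penalty": 1.1,  # discourage repeated tokens
}

response = ollama.generate(
    model="llama3",  # placeholder model name
    prompt="Extract the material formula from: ...",
    options=options,
)
print(response["response"])
```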
## Temperature
Controls output randomness by scaling the probability distribution over tokens.
### How It Works
- 0.0: Deterministic - always picks the most likely token
- 0.1-0.3: Low randomness - good for factual extraction
- 0.5-0.7: Balanced - some creativity
- 0.8-1.0: High randomness - diverse outputs
- >1.0: Very random - experimental
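To make the scaling concrete, here is a minimal plain-Python sketch (illustrative, not nerxiv code): logits are divided by the temperature before the softmax, so low values sharpen the distribution toward the top token and high values flatten it. The logits are made-up toy values:

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token index after temperature-scaling the logits."""
    if temperature == 0.0:
        # Deterministic: always take the most likely token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    # Softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs)[0]

# Toy example: three candidate tokens with made-up logits.
logits = [2.0, 1.0, 0.1]
print(sample_with_temperature(logits, 0.0))  # always index 0
print(sample_with_temperature(logits, 0.7))  # usually 0, sometimes 1 or 2
```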
### Example Comparison
With temperature=0.0:

```
Run 1: La0.8Sr0.2NiO2
Run 2: La0.8Sr0.2NiO2
Run 3: La0.8Sr0.2NiO2
```

(identical output on every run)

With temperature=0.7:

```
Run 1: La0.8Sr0.2NiO2
Run 2: La₀.₈Sr₀.₂NiO₂
Run 3: La0.8Sr0.2NiO2, lanthanum strontium nickelate
```

(variations in format)
## `top_p` (Nucleus Sampling)
Limits token selection to the smallest set whose cumulative probability exceeds top_p.
### How It Works
- 0.5: Only consider top 50% probability mass
- 0.9: Consider tokens making up 90% probability (recommended)
- 0.95: More diverse outputs
- 1.0: Consider all tokens (disabled)
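A plain-Python sketch of the nucleus cut (illustrative, not library code): rank tokens by probability, keep the smallest prefix whose cumulative mass reaches `top_p`, renormalize, and sample only from what survives. The probabilities below are made up:

```python
import random

def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability >= top_p."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize the surviving probabilities.
    total = sum(p for _, p in kept)
    return [(idx, p / total) for idx, p in kept]

# Toy distribution over four tokens (made-up values).
probs = [0.5, 0.3, 0.15, 0.05]
nucleus = nucleus_filter(probs, top_p=0.9)   # keeps tokens 0, 1, 2
indices = [idx for idx, _ in nucleus]
weights = [p for _, p in nucleus]
print(random.choices(indices, weights=weights)[0])
```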
### Interaction with Temperature
- Low temperature + low top_p = Very focused, deterministic
- Low temperature + high top_p = Consistent but considers more options
- High temperature + low top_p = Randomly picks from focused set (unstable)
- High temperature + high top_p = Maximum diversity
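In the usual sampling pipeline, temperature is applied first and the `top_p` cut then operates on the reshaped probabilities, which is why a high temperature with a low `top_p` samples randomly from a small, shifting set. A compact sketch composing the two helpers above (assumes `math`, `random`, and `nucleus_filter` from the earlier snippets):

```python
def sample(logits, temperature, top_p):
    """Temperature scaling first, then the nucleus cut on the result."""
    scaled = [l / max(temperature, 1e-8) for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    probs = [e / sum(exps) for e in exps]
    nucleus = nucleus_filter(probs, top_p)
    return random.choices([i for i, _ in nucleus],
                          weights=[p for _, p in nucleus])[0]
```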
## `num_ctx` (Context Size)
Maximum number of tokens the model can process (input + output).
### Choosing the Right Size
2048 tokens (~1500 words):
- Fast processing
- Sufficient for 3-5 small chunks
- Use for simple extraction
4096 tokens (~3000 words):
- Standard setting
- Good for 5-7 medium chunks
- Balance of speed and context
8192 tokens (~6000 words):
- Large context (recommended for papers)
- 8-12 chunks
- Better understanding of context
16384+ tokens:
- Very large context
- May be slower
- Check model support
### Estimating Token Count
Rough estimates:
- 1 token ≈ 0.75 words (English)
- 1 token ≈ 4 characters
- Your prompt template ≈ 200-500 tokens
- Each chunk ≈ 250-750 tokens (depending on chunker settings)
Example calculation:

```
Prompt: 300 tokens
5 chunks × 500 tokens = 2500 tokens
Output: 200 tokens
Total: ~3000 tokens → use num_ctx=4096
```
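The same arithmetic as a small hypothetical helper (not a nerxiv function) that rounds the estimate up to the next power-of-two context size:

```python
def suggest_num_ctx(prompt_tokens, n_chunks, tokens_per_chunk, output_tokens=200):
    """Estimate total tokens and suggest the next power-of-two num_ctx."""
    total = prompt_tokens + n_chunks * tokens_per_chunk + output_tokens
    num_ctx = 2048
    while num_ctx < total:
        num_ctx *= 2
    return total, num_ctx

# The worked example above: 300 + 5 * 500 + 200 = 3000 tokens -> num_ctx=4096
total, num_ctx = suggest_num_ctx(prompt_tokens=300, n_chunks=5, tokens_per_chunk=500)
print(total, num_ctx)  # 3000 4096
```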
## Validating Parameter Effects
Test parameter changes systematically:
```bash
# Baseline
nerxiv prompt --file-path paper.hdf5 --query material_formula

# Test temperature
nerxiv prompt --file-path paper.hdf5 --query material_formula -llmo temperature=0.0
nerxiv prompt --file-path paper.hdf5 --query material_formula -llmo temperature=0.3

# Test context size
nerxiv prompt --file-path paper.hdf5 --n-top-chunks 5 -llmo num_ctx=4096
nerxiv prompt --file-path paper.hdf5 --n-top-chunks 10 -llmo num_ctx=8192
```
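To automate the comparison, a short Python sketch can sweep one parameter while holding the rest fixed, invoking the same CLI via subprocess. The file path and query are taken from the examples above; the temperature values are illustrative:

```python
import subprocess

# Sweep temperature while holding everything else fixed.
for temperature in [0.0, 0.3, 0.7]:
    cmd = [
        "nerxiv", "prompt",
        "--file-path", "paper.hdf5",
        "--query", "material_formula",
        "-llmo", f"temperature={temperature}",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(f"temperature={temperature}: {result.stdout.strip()}")
```

Running each setting several times and comparing outputs across runs shows both the effect of the parameter and the run-to-run consistency at that setting.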