feat(search): add TinyFish as search provider#30492

Closed
simantak-dabhade wants to merge 3 commits into BerriAI:litellm_oss_staging_150626 from simantak-dabhade:feat/add-tinyfish-search-provider

simantak-dabhade commented Jun 15, 2026

Re-opening from #30158 which was approved (LGTM from @Sameerlite) but closed due to merge conflicts when the base branch moved. Rebased onto today's staging branch with conflicts resolved; the TinyFish code itself is unchanged.

What this does

Adds TinyFish (https://tinyfish.ai) as a search provider, following the same GET-based pattern as Brave and Serper. The implementation is self-contained in litellm/llms/tinyfish/ with standard registration hooks.

  • TinyfishSearchConfig extends BaseSearchConfig with GET method, X-API-Key header auth, and the _tinyfish_params wrapper dict pattern for URL query encoding
  • Unified param mappings: country -> location, search_domain_filter -> site: query injection, max_results -> clamped (1-20) and enforced client-side in transform_search_response
  • 7 unit tests (all mocked, no real API calls in CI)
  • Live integration test transcript with real API calls posted in #30158
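
The parameter mapping described in the list above can be pictured roughly as follows. This is a minimal sketch, not the code from `transformation.py`: the function name `map_unified_params` and the `query` parameter name on the TinyFish side are assumptions made for illustration. The domain-filter string uses the implicit-adjacency form shown in this PR's diff.

```python
from typing import Any, Dict, List, Optional

MAX_RESULTS_CAP = 20  # upper bound stated in the PR description (1-20)


def map_unified_params(
    query: str,
    country: Optional[str] = None,
    search_domain_filter: Optional[List[str]] = None,
    max_results: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: translate unified search params to provider params."""
    if search_domain_filter:
        # search_domain_filter -> site: clauses injected into the query text
        clauses = " OR ".join(f"site:{d}" for d in search_domain_filter)
        query = f"({query}) ({clauses})"

    params: Dict[str, Any] = {"query": query}  # "query" key is an assumption
    if country is not None:
        params["location"] = country  # country -> location
    if max_results is not None:
        # Clamp to the 1-20 range before sending
        params["max_results"] = max(1, min(int(max_results), MAX_RESULTS_CAP))
    return params


mapped = map_unified_params(
    "llm", country="US", search_domain_filter=["arxiv.org"], max_results=50
)
print(mapped)
```

Omitted optional parameters are left out of the result entirely, so the provider's own defaults apply.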

Files changed (8)

| File | Change |
| --- | --- |
| `litellm/llms/tinyfish/search/__init__.py` | New module export |
| `litellm/llms/tinyfish/search/transformation.py` | Core TinyfishSearchConfig implementation |
| `litellm/types/utils.py` | Add TINYFISH to SearchProviders enum |
| `litellm/utils.py` | Register config in get_provider_search_config() |
| `model_prices_and_context_window.json` | Add tinyfish/search pricing entry |
| `provider_endpoints_support.json` | Add tinyfish endpoint metadata |
| `tests/code_coverage_tests/enforce_llms_folder_style.py` | Add tinyfish to SEARCH_PROVIDERS |
| `tests/search_tests/test_tinyfish_search.py` | 7 unit tests |

Proof of testing

Live integration test output against real TinyFish API (full transcript in #30158):

Test 1: Basic search ("what is litellm proxy") -> 9 results, top hit: github.com/BerriAI/litellm
Test 2: Domain filter (arxiv.org) -> 10 results, all from arxiv.org (Attention Is All You Need, etc.)
Test 3: max_results=2 -> exactly 2 results returned
Test 4: Country="US" -> 10 US-localized results

Prior review history

simantak-dabhade and others added 3 commits June 15, 2026 16:16
Add TinyFish (https://tinyfish.ai) as a new search provider for the
LiteLLM Search API. TinyFish provides web search results via
GET https://api.search.tinyfish.ai with X-API-Key authentication.

Supports unified params (query, country, search_domain_filter) and
TinyFish-specific passthrough params (language, page, include_thumbnail).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…thumbnail type

Address Greptile review feedback:
- Map max_results to query param and truncate results client-side (TinyFish API has no count param)
- Fix include_thumbnail type annotation from str to bool
- Add test for max_results truncation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ch_request

Address Greptile P2 feedback: apply max(1, ...) in transform_search_request
so the API receives the same clamped value that transform_search_response uses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@simantak-dabhade (Author)

Hey @Sameerlite, this is the same PR you approved in #30158 -- just rebased onto today's staging branch to resolve the merge conflicts you flagged. The conflicts were all in shared files (types/utils.py, utils.py, model_prices_and_context_window.json, provider_endpoints_support.json, enforce_llms_folder_style.py) where the new you_com and apiserpent providers landed while our PR was waiting. TinyFish entries now sit alongside those.

The TinyFish code itself is unchanged from what you reviewed. Happy to answer any questions.

codecov Bot commented Jun 15, 2026

Codecov Report

❌ Patch coverage is 42.85714% with 40 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| `litellm/llms/tinyfish/search/transformation.py` | 39.39% | 40 Missing ⚠️ |


greptile-apps Bot (Contributor) commented Jun 15, 2026

Greptile Summary

This PR adds TinyFish as a new search provider, following the same GET-based pattern as Brave and Serper. The implementation is self-contained under litellm/llms/tinyfish/ with standard registration hooks, enum entry, and pricing metadata.

  • TinyfishSearchConfig maps the unified Perplexity params (country→location, search_domain_filter→site: injection, max_results clamped 1–20) and wraps query params in a _tinyfish_params dict for URL encoding via get_complete_url — identical mechanism to _brave_params.
  • 7 fully-mocked unit tests cover param mapping, domain filter, language passthrough, truncation, empty results, and missing-key errors; no real network calls touch CI.
  • Minor: the _append_domain_filters helper omits the AND keyword between the query and domain groups (Brave uses explicit AND), which could be ambiguous if TinyFish's parser defaults spaces to OR.
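
The wrapper-dict mechanism mentioned in the summary can be sketched as follows: the request transformation stores the GET parameters under a private key, and URL construction later reads that key and encodes it into the query string. This is an illustrative sketch under stated assumptions, not LiteLLM's actual code. The function names `build_request_body` and `build_url` and the `query` parameter name are hypothetical; the base URL is taken from the commit message.

```python
from typing import Any, Dict
from urllib.parse import urlencode

API_BASE = "https://api.search.tinyfish.ai"  # from the commit message


def build_request_body(query: str, **optional: Any) -> Dict[str, Any]:
    """Hypothetical: collect GET params under a private wrapper key."""
    params = {"query": query}  # "query" key is an assumption
    # Drop unset optional params so they are not sent as "None"
    params.update({k: v for k, v in optional.items() if v is not None})
    return {"_tinyfish_params": params}


def build_url(api_base: str, data: Dict[str, Any]) -> str:
    """Hypothetical: encode the wrapped params into the URL query string."""
    params = data.get("_tinyfish_params", {})
    if not params:
        return api_base
    return f"{api_base}?{urlencode(params)}"


body = build_request_body("litellm proxy", location="US", language=None)
print(build_url(API_BASE, body))
```

Keeping the parameters in a wrapper dict lets a GET-based provider reuse a request pipeline that otherwise passes a body around, since nothing has to be serialized as JSON.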

Confidence Score: 4/5

Safe to merge — the change is entirely additive, touches no existing provider logic, and all wiring follows the established pattern.

The integration is clean and self-contained. The only open questions are the implicit-AND domain filter (works in live testing, but a future TinyFish parser change could silently break domain-filtered searches) and a cosmetic ambiguity in the response-side max_results floor. Neither affects correctness today.

litellm/llms/tinyfish/search/transformation.py — specifically the _append_domain_filters method and the max_results handling in transform_search_response

Important Files Changed

| Filename | Overview |
| --- | --- |
| `litellm/llms/tinyfish/search/transformation.py` | Core TinyfishSearchConfig implementation: clean GET-based pattern matching Brave/Serper; minor: domain filter omits explicit AND keyword between query and domain clauses (vs Brave's AND), and max_results client-side clamping lower-bound of 1 is applied in response even though user could legitimately expect 0-cap behavior |
| `tests/search_tests/test_tinyfish_search.py` | 7 fully-mocked unit tests covering basic search, param mapping, domain filter injection, language passthrough, max_results truncation, empty results, and missing API key; no real network calls, consistent with other search provider tests |
| `litellm/types/utils.py` | Adds TINYFISH to SearchProviders enum: correct, minimal change |
| `litellm/utils.py` | Registers TinyfishSearchConfig in get_provider_search_config() dispatch map: correct, follows the exact same pattern as all other search providers |
| `model_prices_and_context_window.json` | Adds tinyfish/search pricing entry with input_cost_per_query=0.0 and subscription pricing note: correct format |
| `provider_endpoints_support.json` | Adds tinyfish endpoint metadata with search=true; references a docs URL (docs.litellm.ai/docs/search/tinyfish) that presumably does not exist yet |
| `tests/code_coverage_tests/enforce_llms_folder_style.py` | Adds "tinyfish" to SEARCH_PROVIDERS list for folder-style enforcement: correct registration |
| `litellm/llms/tinyfish/search/__init__.py` | Exports TinyfishSearchConfig: correct, minimal boilerplate consistent with other providers |

Reviews (1): Last reviewed commit: "fix(search/tinyfish): clamp max_results ..."

Comment on lines +115 to +116
domain_clauses = " OR ".join(f"site:{d}" for d in domains)
return f"({query}) ({domain_clauses})"

Contributor

P2 The domain-filter query uses implicit adjacency (({query}) ({domain_clauses})) rather than an explicit AND, unlike the Brave implementation which produces ({query}) AND ({domain_clauses}). In Boolean query parsers, a bare space can default to OR depending on the engine, which would return results matching either the original query or the domain list rather than the intersection. The live test showed correct behavior, but making the operator explicit removes ambiguity if TinyFish's parser ever changes defaults.

Suggested change
- domain_clauses = " OR ".join(f"site:{d}" for d in domains)
- return f"({query}) ({domain_clauses})"
+ domain_clauses = " OR ".join(f"site:{d}" for d in domains)
+ return f"({query}) AND ({domain_clauses})"

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Author

Addressed in #30592. Changed to explicit AND: return f"({query}) AND ({domain_clauses})"
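
To make the difference between the two query shapes concrete, here is a small standalone sketch of the helper discussed in this thread. It is based on the two lines quoted in the review comment. The free-function form, the `explicit_and` flag, and the early return for an empty domain list are additions for illustration and are not taken from the PR.

```python
from typing import List, Optional


def append_domain_filters(
    query: str, domains: Optional[List[str]], explicit_and: bool = True
) -> str:
    """Restrict a query to the given domains using site: clauses."""
    if not domains:
        # Assumed behavior: no filter requested, leave the query untouched
        return query
    domain_clauses = " OR ".join(f"site:{d}" for d in domains)
    # explicit_and=False reproduces the implicit-adjacency form from the diff
    joiner = " AND " if explicit_and else " "
    return f"({query}){joiner}({domain_clauses})"


print(append_domain_filters("transformers", ["arxiv.org", "acm.org"]))
print(append_domain_filters("transformers", ["arxiv.org"], explicit_and=False))
```

With the explicit operator, the intent (query AND any of the domains) no longer depends on how the search engine interprets a bare space between two parenthesized groups.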

Comment on lines +126 to +127
query_params = raw_response.request.url.params if raw_response.request else {}
max_results = max(1, min(int(query_params.get("max_results", 20)), 20))

Contributor

P2 max_results lower-bound of 1 silently overrides a max_results=0 request

max(1, min(int(query_params.get("max_results", 20)), 20)) clamps the client-side cap to at least 1 even when max_results=0 was explicitly sent. Because transform_search_request also clamps with max(1, ...), the value 0 can never actually reach this code, but the asymmetry between the request clamp (explicit, clearly intentional) and the response clamp (silent floor of 1 on an unknown default) could confuse future maintainers. Consider either removing the lower-bound here (the loop is safe with max_results=0) or adding a comment explaining why the floor is needed on the response side.

Author

Addressed in #30592. Removed the max(1, ...) floor from transform_search_response, since transform_search_request already clamps max_results to a minimum of 1, so 0 can never reach the response path.
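
The request/response split described in this thread can be sketched as two small functions: the request side clamps to the 1-20 range, and the response side only applies the upper cap before truncating. This is a simplified illustration, not the code in `transformation.py`. The function names are hypothetical, and a plain dict stands in for the request URL's query params.

```python
from typing import Any, Dict, List

MAX_RESULTS_CAP = 20


def clamp_for_request(max_results: Any) -> int:
    """Request side: enforce both the floor (1) and the cap (20)."""
    return max(1, min(int(max_results), MAX_RESULTS_CAP))


def truncate_for_response(
    results: List[Dict[str, Any]], query_params: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Response side: apply only the upper cap, then truncate client-side.

    No floor is needed here as long as the value was produced by
    clamp_for_request, which never emits anything below 1.
    """
    cap = min(int(query_params.get("max_results", MAX_RESULTS_CAP)), MAX_RESULTS_CAP)
    return results[:cap]


results = [{"url": f"https://example.com/{i}"} for i in range(10)]
sent = {"max_results": str(clamp_for_request(2))}  # query params arrive as strings
print(len(truncate_for_response(results, sent)))
```

Keeping the floor in one place means the response path depends on the request path's guarantee, so the two functions should only be changed together.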

@Sameerlite (Collaborator)

Thanks for the contribution! A couple of things to address before this is ready for merge:

  • Greptile's code review left 2 unresolved comment(s) that could use your attention — could you take a look and address them?
  • It looks like some CI checks are failing — could you take a look and fix them, or let us know if the failures are unrelated to your change?

Once those are in, we'll take another look!

@mateo-berri mateo-berri deleted the branch BerriAI:litellm_oss_staging_150626 June 16, 2026 19:06