feat: Implement SearchXNG as alternative search backend
- Add SearchXNG search function in search tool
- Update types for SearchXNG results
- Add configuration options in README for SearchXNG setup
casistack committed Aug 13, 2024
1 parent f68855d commit ddd1669
Showing 8 changed files with 2,650 additions and 27 deletions.
7 changes: 7 additions & 0 deletions .env.local.example
@@ -15,6 +15,13 @@ LOCAL_REDIS_URL=redis://localhost:6379 # or redis://redis:6379 if you're using d
UPSTASH_REDIS_REST_URL=[YOUR_UPSTASH_REDIS_REST_URL]
UPSTASH_REDIS_REST_TOKEN=[YOUR_UPSTASH_REDIS_REST_TOKEN]

SEARCHXNG_API_URL=http://localhost:8080 # Replace with your local SearchXNG API URL or docker http://searchxng:8080
SEARCH_API=searchxng # use searchxng, tavily or exa
SEARXNG_SECRET="" # generate a secret key e.g. openssl rand -base64 32
SEARXNG_PORT=8080 # default port
SEARXNG_BIND_ADDRESS=0.0.0.0 # default address
SEARXNG_IMAGE_PROXY=true # enable image proxy
SEARXNG_LIMITER=false # can be enabled to limit the number of requests per IP address

# Optional
# The settings below can be used optionally as needed.
48 changes: 47 additions & 1 deletion README.md
@@ -37,7 +37,7 @@ An AI-powered search engine with a generative UI.
- App framework: [Next.js](https://nextjs.org/)
- Text streaming / Generative UI: [Vercel AI SDK](https://sdk.vercel.ai/docs)
- Generative Model: [OpenAI](https://openai.com/)
- Search API: [Tavily AI](https://tavily.com/) / [Serper](https://serper.dev)
- Search API: [Tavily AI](https://tavily.com/) / [Serper](https://serper.dev) / [SearchXNG](https://docs.searxng.org/)
- Reader API: [Jina AI](https://jina.ai/)
- Serverless Database: [Upstash](https://upstash.com/)
- Component library: [shadcn/ui](https://ui.shadcn.com/)
@@ -146,6 +146,52 @@ If you want to use Morphic as a search engine in your browser, follow these step

This will allow you to use Morphic as your default search engine in the browser.

### Using SearchXNG as an Alternative Search Backend

Morphic now supports SearchXNG as an alternative search backend. To use SearchXNG:

1. Ensure you have Docker and Docker Compose installed on your system.
2. In your `.env.local` file, set the following variables:

- SEARCHXNG_API_URL=http://searchxng:8080 # or http://localhost:8080 when Morphic runs outside Docker
- SEARXNG_SECRET=your_secret_key_here
- SEARXNG_PORT=8080
- SEARXNG_IMAGE_PROXY=true
- SEARXNG_LIMITER=false # can be enabled to limit the number of requests per IP
- SEARCH_API=searchxng

3. Two configuration files are provided in the root directory:
- `searxng-settings.yml`: This file contains the main configuration for SearchXNG, including engine settings and server options.
- `searxng-limiter.toml`: This file configures the rate limiting and bot detection features of SearchXNG.

4. Run `docker-compose up` to start the Morphic stack with SearchXNG included.
5. SearchXNG will be available at `http://localhost:8080` and Morphic will use it as the search backend.
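
To sanity-check the setup before starting Morphic, it can help to build the same kind of request URL that the search tool sends and open it with `curl` or a browser. The sketch below is illustrative only: `buildSearxngUrl` is a hypothetical helper, not part of this commit, and it assumes the instance accepts `format=json` (in SearXNG, the upstream spelling of the project, this usually means `json` is listed under `search.formats` in `searxng-settings.yml`).

```typescript
// Hypothetical helper (not part of Morphic): builds a JSON search URL
// for a SearchXNG instance, mirroring the `q` and `format` parameters
// that the search tool sends.
function buildSearxngUrl(baseUrl: string, query: string): string {
  // Drop trailing slashes so the path does not end up as "//search".
  const url = new URL(`${baseUrl.replace(/\/+$/, '')}/search`)
  url.searchParams.set('q', query)
  url.searchParams.set('format', 'json')
  return url.toString()
}

// Example: print a URL that can be pasted into curl or a browser.
console.log(buildSearxngUrl('http://localhost:8080', 'morphic'))
```

If the request is rejected with a `403` status, the JSON output format is likely not enabled in the settings file.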

#### Customizing SearchXNG

* You can modify `searxng-settings.yml` to enable/disable specific search engines, change UI settings, or adjust server options.
* The `searxng-limiter.toml` file allows you to configure rate limiting and bot detection. This is useful if you're exposing SearchXNG directly to the internet.
* If you prefer not to use external configuration files, you can set these options using environment variables in the `docker-compose.yml` file or directly in the SearchXNG container.

#### Advanced Configuration

* To disable the limiter entirely, set `SEARXNG_LIMITER=false` in the SearchXNG service environment variables.
* For production use, set `SEARXNG_SECRET` to a secure, randomly generated value (for example, the output of `openssl rand -base64 32`).
* The `SEARXNG_IMAGE_PROXY` option allows SearchXNG to proxy image results, enhancing privacy. Set to `true` to enable this feature.

#### Troubleshooting

* If you encounter issues with specific search engines (e.g., Wikidata), you can disable them in `searxng-settings.yml`:

```yaml
engines:
  - name: wikidata
    disabled: true
```
* Refer to the SearXNG settings documentation for all available options: https://docs.searxng.org/admin/settings/settings.html#settings-yml
## ✅ Verified models
### List of models applicable to all:
Binary file modified bun.lockb
Binary file not shown.
14 changes: 13 additions & 1 deletion docker-compose.yaml
@@ -12,6 +12,7 @@ services:
- "3000:3000" # Maps port 3000 on the host to port 3000 in the container.
depends_on:
- redis
- searchxng

redis:
image: redis:alpine
@@ -21,5 +22,16 @@ services:
- redis_data:/data
command: redis-server --appendonly yes

searchxng:
image: searxng/searxng
ports:
- "${SEARXNG_PORT:-8080}:8080"
env_file: .env.local # can remove if you want to use env variables or in settings.yml
volumes:
- ./searxng-limiter.toml:/etc/searxng/limiter.toml
- ./searxng-settings.yml:/etc/searxng/settings.yml
- searchxng_data:/data

volumes:
redis_data:
redis_data:
searchxng_data:
136 changes: 111 additions & 25 deletions lib/agents/tools/search.tsx
@@ -5,7 +5,7 @@ import { searchSchema } from '@/lib/schema/search'
import { SearchSection } from '@/components/search-section'
import { ToolProps } from '.'
import { sanitizeUrl } from '@/lib/utils'
import { SearchResults } from '@/lib/types'
import { SearchResults, SearchResultItem, SearchXNGResponse, SearchXNGResult } from '@/lib/types'

export const searchTool = ({ uiStream, fullResponse }: ToolProps) =>
tool({
@@ -28,36 +28,35 @@ export const searchTool = ({ uiStream, fullResponse }: ToolProps) =>
/>
)

// Tavily API requires a minimum of 5 characters in the query
// Ensure minimum query length for all APIs
const filledQuery =
query.length < 5 ? query + ' '.repeat(5 - query.length) : query
let searchResult
const searchAPI: 'tavily' | 'exa' = 'tavily'
let searchResult: SearchResults
const searchAPI = (process.env.SEARCH_API as 'tavily' | 'exa' | 'searchxng') || 'tavily'
console.log(`Using search API: ${searchAPI}`)

try {
searchResult =
searchResult = await (
searchAPI === 'tavily'
? await tavilySearch(
filledQuery,
max_results,
search_depth,
include_domains,
exclude_domains
)
: await exaSearch(query)
? tavilySearch
: searchAPI === 'exa'
? exaSearch
: searchXNGSearch
)(filledQuery, max_results, search_depth, include_domains, exclude_domains)
} catch (error) {
console.error('Search API error:', error)
hasError = true
searchResult = { results: [], query: filledQuery, images: [], number_of_results: 0 }
}

if (hasError) {
fullResponse = `An error occurred while searching for "${query}.`
fullResponse = `An error occurred while searching for "${filledQuery}".`
uiStream.update(null)
streamResults.done()
return searchResult
}

streamResults.done(JSON.stringify(searchResult))

return searchResult
}
})
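
The cast on `process.env.SEARCH_API` in this hunk does not validate the value at runtime, so an unrecognized setting would fall through to the SearchXNG branch of the ternary. A minimal sketch of a stricter lookup, assuming a hypothetical helper named `resolveSearchBackend` (not part of this commit), might look like this:

```typescript
// The three backend names accepted by the SEARCH_API variable in this commit.
type SearchBackend = 'tavily' | 'exa' | 'searchxng'

const SEARCH_BACKENDS: readonly string[] = ['tavily', 'exa', 'searchxng']

// Hypothetical helper: normalizes the raw environment value and falls back
// to 'tavily' (the default used in the commit) when it is missing or unknown.
function resolveSearchBackend(raw: string | undefined): SearchBackend {
  const value = (raw ?? '').trim().toLowerCase()
  return SEARCH_BACKENDS.includes(value) ? (value as SearchBackend) : 'tavily'
}
```

It would be called as `resolveSearchBackend(process.env.SEARCH_API)`. Falling back silently is a design choice; throwing on an unknown value is an equally reasonable alternative.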
@@ -68,8 +67,12 @@ async function tavilySearch(
searchDepth: 'basic' | 'advanced' = 'basic',
includeDomains: string[] = [],
excludeDomains: string[] = []
): Promise<any> {
): Promise<SearchResults> {
const apiKey = process.env.TAVILY_API_KEY
if (!apiKey) {
throw new Error('TAVILY_API_KEY is not set in the environment variables')
}

const response = await fetch('https://api.tavily.com/search', {
method: 'POST',
headers: {
@@ -78,7 +81,7 @@ async function tavilySearch(
body: JSON.stringify({
api_key: apiKey,
query,
max_results: maxResults < 5 ? 5 : maxResults,
max_results: Math.max(maxResults, 5),
search_depth: searchDepth,
include_images: true,
include_answers: true,
@@ -88,31 +91,114 @@ async function tavilySearch(
})

if (!response.ok) {
throw new Error(`Error: ${response.status}`)
throw new Error(`Tavily API error: ${response.status} ${response.statusText}`)
}

// sanitize the image urls
const data = await response.json()
const sanitizedData: SearchResults = {
return {
...data,
images: data.images.map((url: any) => sanitizeUrl(url))
images: data.images.map((url: string) => sanitizeUrl(url))
}

return sanitizedData
}

async function exaSearch(
query: string,
maxResults: number = 10,
_searchDepth: string,
includeDomains: string[] = [],
excludeDomains: string[] = []
): Promise<any> {
): Promise<SearchResults> {
const apiKey = process.env.EXA_API_KEY
if (!apiKey) {
throw new Error('EXA_API_KEY is not set in the environment variables')
}

const exa = new Exa(apiKey)
return exa.searchAndContents(query, {
const exaResults = await exa.searchAndContents(query, {
highlights: true,
numResults: maxResults,
includeDomains,
excludeDomains
})

return {
results: exaResults.results.map((result: any) => ({
title: result.title,
url: result.url,
content: result.highlight || result.text
})),
query,
images: [],
number_of_results: exaResults.results.length
}
}

async function searchXNGSearch(
query: string,
maxResults: number = 10,
_searchDepth: string,
includeDomains: string[] = [],
excludeDomains: string[] = []
): Promise<SearchResults> {
const apiUrl = process.env.SEARCHXNG_API_URL
if (!apiUrl) {
throw new Error('SEARCHXNG_API_URL is not set in the environment variables')
}

try {
// Construct the URL with query parameters
const url = new URL(`${apiUrl}/search`)
url.searchParams.append('q', query)
url.searchParams.append('format', 'json')
url.searchParams.append('max_results', maxResults.toString())
// Enable both general and image results
url.searchParams.append('categories', 'general,images')
// Add domain filters if specified
if (includeDomains.length > 0) {
url.searchParams.append('include_domains', includeDomains.join(','))
}
if (excludeDomains.length > 0) {
url.searchParams.append('exclude_domains', excludeDomains.join(','))
}
// Fetch results from SearchXNG
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Accept': 'application/json'
}
})

if (!response.ok) {
const errorText = await response.text()
console.error(`SearchXNG API error (${response.status}):`, errorText)
throw new Error(`SearchXNG API error: ${response.status} ${response.statusText} - ${errorText}`)
}

const data: SearchXNGResponse = await response.json()
//console.log('SearchXNG API response:', JSON.stringify(data, null, 2))

// Separate general results and image results
const generalResults = data.results.filter(result => !result.img_src)
const imageResults = data.results.filter(result => result.img_src)

// Format the results to match the expected SearchResults structure
return {
results: generalResults.map((result: SearchXNGResult): SearchResultItem => ({
title: result.title,
url: result.url,
content: result.content
})),
query: data.query,
images: imageResults.map(result => {
const imgSrc = result.img_src || '';
// If image_proxy is disabled, img_src should always be a full URL
// If it's enabled, it might be a relative URL
return imgSrc.startsWith('http') ? imgSrc : `${apiUrl}${imgSrc}`
}).filter(Boolean), // Remove any empty strings
number_of_results: data.number_of_results
}
} catch (error) {
console.error('SearchXNG API error:', error)
throw error
}
}
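
The result handling in `searchXNGSearch` can be isolated into pure functions, which makes it easier to test without a running instance. The sketch below is one possible shape, not the commit's implementation: `resolveImageUrl` and `splitResults` are hypothetical names. It also caps the text results on the client, on the assumption that the server may ignore `max_results`, `include_domains`, and `exclude_domains`, since the SearXNG Search API documentation does not appear to list them as query parameters.

```typescript
// Local copy of the result shape from lib/types, so the sketch is self-contained.
interface SearchXNGResult {
  title: string
  url: string
  content: string
  img_src?: string
}

// Hypothetical helper: absolute image URLs are kept as they are, relative
// ones (possible when the image proxy is enabled) are prefixed with the API URL.
function resolveImageUrl(apiUrl: string, imgSrc: string): string {
  return imgSrc.startsWith('http') ? imgSrc : `${apiUrl}${imgSrc}`
}

// Hypothetical helper: splits mixed results into text results and image URLs,
// limiting the text results to maxResults on the client side.
function splitResults(
  apiUrl: string,
  results: SearchXNGResult[],
  maxResults: number
): { results: { title: string; url: string; content: string }[]; images: string[] } {
  const general = results.filter(r => !r.img_src).slice(0, maxResults)
  const images = results
    .filter(r => r.img_src)
    .map(r => resolveImageUrl(apiUrl, r.img_src as string))
  return {
    results: general.map(({ title, url, content }) => ({ title, url, content })),
    images
  }
}
```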
14 changes: 14 additions & 0 deletions lib/types/index.ts
@@ -2,6 +2,7 @@ export type SearchResults = {
images: string[]
results: SearchResultItem[]
query: string
number_of_results?: number
}

export type ExaSearchResults = {
@@ -70,3 +71,16 @@ export type AIMessage = {
| 'followup'
| 'end'
}

export interface SearchXNGResult {
title: string;
url: string;
content: string;
img_src?: string;
}

export interface SearchXNGResponse {
query: string;
number_of_results: number;
results: SearchXNGResult[];
}
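
These interfaces describe the expected JSON shape, but `response.json()` is not checked against them at runtime. A small type guard is one way to close that gap. The sketch below is an assumption about how such a check could look; `isSearchXNGResponse` is a hypothetical name, and the interfaces are repeated only to keep the example self-contained.

```typescript
interface SearchXNGResult {
  title: string
  url: string
  content: string
  img_src?: string
}

interface SearchXNGResponse {
  query: string
  number_of_results: number
  results: SearchXNGResult[]
}

// Hypothetical type guard: checks only the top-level fields that
// searchXNGSearch reads, not every property of every result.
function isSearchXNGResponse(value: unknown): value is SearchXNGResponse {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.query === 'string' &&
    typeof v.number_of_results === 'number' &&
    Array.isArray(v.results)
  )
}
```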
1 change: 1 addition & 0 deletions searxng-limiter.toml
@@ -0,0 +1 @@
#https://docs.searxng.org/admin/searx.limiter.html