diff options
| author | Michelle Tilley <michelle@michelletilley.net> | 2026-02-26 14:42:08 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2026-02-26 14:42:08 -0800 |
| commit | 3ba47446f06d5b0fbeaeb59d4ffed768b70729d8 (patch) | |
| tree | 28348bb3d18e9983e9212c26840f691766cad985 /crates/atuin-client/src | |
| parent | feat: Add history author/intent metadata and v1 record version (#3205) (diff) | |
| download | atuin-3ba47446f06d5b0fbeaeb59d4ffed768b70729d8.zip | |
feat: In-memory search index with atuin daemon (#3201)
## Summary
This PR adds a persistent, in-memory search index to the Atuin daemon,
enabling fast fuzzy search without the startup delay of building an
index each time the TUI opens.
### Key Changes
- **Daemon search service**: A new gRPC service that maintains a Nucleo
fuzzy search index in memory
- **Real-time index updates**: The daemon listens for history events
(new commands, synced records) and updates the index immediately
- **Filter mode support**: All existing filter modes work (Global, Host,
Session, Directory, Workspace)
- **New search engine**: `daemon-fuzzy` search mode that queries the
daemon instead of building a local index
- **Paged history loading**: Database pagination support for efficient
initial index loading
- **Configurable logging**: New `[logs]` settings section for daemon and
search log configuration
- **Component-based daemon architecture**: Refactored daemon internals
into a modular, event-driven system
- **Fallback to DB search for regex**: Since Nucleo doesn't support
regex matching
## Daemon Architecture
The daemon has been refactored to use a component-based, event-driven
architecture that makes it easier to add new functionality and reason
about the system.
### Core Concepts
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Atuin Daemon │
│ │
│ ┌─────────────┐ ┌──────────────────────────────────────────────────┐ │
│ │ Daemon │ │ Components │ │
│ │ Handle │────▶│ │ │
│ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ • emit() │ │ │ History │ │ Search │ │ Sync │ │ │
│ │ • subscribe │ │ │ Component │ │ Component │ │ Component │ │ │
│ │ • settings │ │ │ │ │ │ │ │ │ │
│ │ • databases │ │ │ gRPC service│ │ gRPC service│ │ background │ │ │
│ └─────────────┘ │ │ WIP history │ │ Nucleo index│ │ sync │ │ │
│ │ │ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ │ └──────────────────────────────────────────────────┘ │
│ │ ▲ │
│ ▼ │ │
│ ┌─────────────────────────────────────┴────────────────────────────────┐ │
│ │ Event Bus (broadcast) │ │
│ │ │ │
│ │ HistoryStarted │ HistoryEnded │ RecordsAdded │ SyncCompleted │ ... │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ ▲ │
│ ┌───────────────────────────────────┴──────────────────────────────────┐ │
│ │ Control Service (gRPC) │ │
│ │ External event injection from CLI commands │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
### DaemonHandle
A lightweight, cloneable handle that provides access to shared daemon
resources:
- **Event emission**: `handle.emit(DaemonEvent::...)` broadcasts to all
components
- **Event subscription**: `handle.subscribe()` returns a receiver for
the event bus
- **Settings**: `handle.settings()` for configuration access
- **Databases**: `handle.history_db()` and `handle.store()` for data
access
### Component Trait
Components implement a simple lifecycle:
```rust
#[async_trait]
trait Component: Send + Sync {
fn name(&self) -> &'static str;
async fn start(&mut self, handle: DaemonHandle) -> Result<()>;
async fn handle_event(&mut self, event: &DaemonEvent) -> Result<()>;
async fn stop(&mut self) -> Result<()>;
}
```
### Event-Driven Design
Components communicate via events rather than direct coupling:
| Event | Emitted By | Consumed By |
|-------|-----------|-------------|
| `HistoryStarted` | History gRPC | Search (logging) |
| `HistoryEnded` | History gRPC | Search (index update) |
| `RecordsAdded` | Sync | Search (index update) |
| `HistoryPruned` | CLI (via Control) | Search (index rebuild) |
| `HistoryDeleted` | CLI (via Control) | Search (index rebuild) |
| `ForceSync` | CLI (via Control) | Sync |
| `ShutdownRequested` | Signal handler | All (graceful shutdown) |
### External Event Injection
CLI commands can inject events into a running daemon:
```rust
// After `atuin history prune`
emit_event(DaemonEvent::HistoryPruned).await?;
// After deleting specific items
emit_event(DaemonEvent::HistoryDeleted { ids }).await?;
// Request immediate sync
emit_event(DaemonEvent::ForceSync).await?;
```
This ensures the daemon's search index stays in sync with database
changes made by CLI commands.
## Search Architecture
The search service uses a [forked version of
Nucleo](https://github.com/atuinsh/nucleo-ext) that adds filter and
scorer callbacks, enabling efficient filtering and frecency-based
ranking.
```
┌────────────────────────────────────────────────────────────────┐
│ Atuin Daemon │
│ │
│ ┌─────────────────┐ ┌──────────────────────────────────┐ │
│ │ Event System │───▶│ Search Component │ │
│ │ │ │ │ │
│ │ • RecordsAdded │ │ ┌────────────────────────────┐ │ │
│ │ • HistoryEnded │ │ │ Deduplicated Index │ │ │
│ │ • HistoryPruned │ │ │ │ │ │
│ └─────────────────┘ │ │ CommandData per command: │ │ │
│ │ │ • Global frecency │ │ │
│ ┌─────────────────┐ │ │ • Filter indexes (sets) │ │ │
│ │ Background Task │ │ │ • Invocation history │ │ │
│ │ │ │ └────────────────────────────┘ │ │
│ │ Rebuilds │ │ │ │ │
│ │ frecency map │ │ ▼ │ │
│ │ every 60s │───▶│ ┌────────────────────────────┐ │ │
│ └─────────────────┘ │ │ Nucleo (forked) │ │ │
│ │ │ │ │ │
│ │ │ • Filter callback │ │ │
│ │ │ • Scorer callback │ │ │
│ │ │ • Fuzzy matching │ │ │
│ │ └────────────────────────────┘ │ │
│ └──────────────────────────────────┘ │
│ │ │
│ │ gRPC (Unix socket) │
└──────────────────────────────────────│─────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Search TUI (Client) │
│ │
│ 1. Send query + filter mode + context to daemon │
│ 2. Receive matching history IDs (ranked by frecency) │
│ 3. Hydrate full records from local SQLite database │
│ 4. Display results in TUI │
└─────────────────────────────────────────────────────────────────┘
```
### Nucleo Fork
The [nucleo-ext fork](https://github.com/atuinsh/nucleo-ext) adds two
key features to Nucleo:
1. **Filter callback**: Pre-filter items before fuzzy matching (used for
directory/host/session filtering)
2. **Scorer callback**: Compute custom scores after matching (used for
frecency ranking)
```rust
// Filter: only include commands run in current directory
nucleo.set_filter(Some(Arc::new(|cmd: &String| {
passing_commands.contains(cmd)
})));
// Scorer: combine fuzzy score with frecency
nucleo.set_scorer(Some(Arc::new(|cmd: &String, fuzzy_score: u32| {
let frecency = frecency_map.get(cmd).unwrap_or(0);
fuzzy_score + (frecency * 10)
})));
```
### Deduplicated Index
Commands are stored once per unique command text, with metadata tracking
all invocations:
```rust
struct CommandData {
command: String,
invocations: Vec<Invocation>, // All times this command was run
global_frecency: FrecencyData, // Precomputed frecency score
// O(1) filter indexes
directories: HashSet<String>, // All cwds where command was run
hosts: HashSet<String>, // All hostnames
sessions: HashSet<String>, // All session IDs
}
```
This deduplication means:
- **Fewer items to match**: ~13K unique commands vs ~62K history entries
- **O(1) filter checks**: HashSet lookups instead of scanning
invocations
- **Single frecency score**: Global frecency computed once, used for all
filter modes
### Frecency Scoring
Frecency (frequency + recency) scoring prioritizes recently and
frequently used commands:
```rust
fn compute_frecency(count: u32, last_used: i64, now: i64) -> u32 {
let age_hours = (now - last_used) / 3600;
// Recency: decays over time (half-life ~24 hours)
let recency = (100.0 * (-age_hours as f64 / 24.0).exp()) as u32;
// Frequency: logarithmic scaling
let frequency = (count.ln() * 20.0).min(100.0) as u32;
recency + frequency
}
```
The frecency map is:
- **Precomputed by background task** every 60 seconds
- **Never computed inline** during search (no latency impact)
- **Graceful fallback**: If unavailable, search works without frecency
ranking
### Filter Mode Implementation
| Filter Mode | Implementation |
|-------------|----------------|
| Global | No filter (all commands) |
| Directory | `command.directories.contains(cwd)` |
| Workspace | `command.directories.any(\|d\| d.starts_with(git_root))` |
| Host | `command.hosts.contains(hostname)` |
| Session | `command.sessions.contains(session_id)` |
Filters are pre-computed into a HashSet before the search, making the
filter callback O(1).
### Search Flow
1. **Daemon startup**: Loads history from SQLite in pages, builds
deduplicated index
2. **Frecency precompute**: Background task builds frecency map after
history loads
3. **Search request**: Client sends query with filter mode and context
4. **Filter**: Pre-computed HashSet determines which commands pass the
filter
5. **Match**: Nucleo fuzzy matches the query against command text
6. **Score**: Frecency scorer ranks results (fuzzy score + frecency *
10)
7. **Response**: Returns history IDs for the most recent invocation of
each matching command
8. **Hydration**: Client fetches full records from local SQLite
### Configuration
```toml
# Enable daemon + autostart
[daemon]
enabled = true
autostart = true
# Enable daemon-based fuzzy search
[search]
search_mode = "daemon-fuzzy"
```
## Performance
Performance varies based on several factors, but in most initial testing
with the new architecture shows improvement:
* **Nucleo performs searches up to 4.5x faster**: direct DB search
averages 18.07ms, but the daemon completes the same queries in 3.99ms.
* **IPC overhead is significant, but acceptable**: a significant amount
of wall-time is taken up by the transfer of data over IPC (via UDS in
this case). This averages to about ~7.8ms and accounts for 66% of
client-side wall time.
* **Tail latency improves at every layer**: p99 times correspond to
initial requests, worst-case query patterns, etc. but the average p99
daemon-based response time is 3.6x better than the associated DB-based
search p99 time
* **Query complexity no longer impacts performance**: the Nucleo-based
search shows consistent 2-7ms times regardless of query pattern. The
DB-based search had a 17x variance (3.59ms to 62.46ms).
Interestingly, @ellie - who has a larger history store than I do - gets
even better performance on the IPC layer. This could use a lot more
testing in various edge cases and on various hardware, but seems
promising.
### Regular DB search
```
Individual calls for: db_search
--------------------------------------------------------------------------------------------------------------
# Wall Busy Idle Fields
--------------------------------------------------------------------------------------------------------------
1 32.25ms 32.20ms 47.70µs {"mode":"Fuzzy","query":"^"}
2 19.48ms 19.40ms 84.20µs {"mode":"Fuzzy","query":"^c"}
3 20.40ms 20.10ms 297.00µs {"mode":"Fuzzy","query":"^ca"}
4 13.07ms 13.00ms 69.90µs {"mode":"Fuzzy","query":"^car"}
5 12.17ms 12.10ms 67.10µs {"mode":"Fuzzy","query":"^carg"}
6 20.78ms 20.70ms 76.60µs {"mode":"Fuzzy","query":"^cargo"}
7 9.15ms 9.10ms 53.20µs {"mode":"Fuzzy","query":"^cargo "}
8 10.24ms 10.00ms 237.00µs {"mode":"Fuzzy","query":"^cargo b"}
9 10.01ms 9.68ms 325.00µs {"mode":"Fuzzy","query":"^cargo bu"}
10 5.89ms 5.83ms 57.20µs {"mode":"Fuzzy","query":"^cargo bui"}
11 8.85ms 8.28ms 568.00µs {"mode":"Fuzzy","query":"^cargo buil"}
12 7.70ms 7.49ms 212.00µs {"mode":"Fuzzy","query":"^cargo build"}
13 3.59ms 3.53ms 57.00µs {"mode":"Fuzzy","query":"^cargo build$"}
14 6.50ms 6.44ms 63.60µs {"mode":"Fuzzy","query":"^cargo "}
15 6.48ms 6.38ms 100.00µs {"mode":"Fuzzy","query":"!"}
16 31.68ms 31.60ms 75.90µs {"mode":"Fuzzy","query":"!g"}
17 62.46ms 62.40ms 58.90µs {"mode":"Fuzzy","query":"!gi"}
18 30.35ms 30.30ms 46.90µs {"mode":"Fuzzy","query":"!git"}
19 53.84ms 53.80ms 40.80µs {"mode":"Fuzzy","query":"!git "}
20 19.24ms 19.20ms 39.70µs {"mode":"Fuzzy","query":"!git c"}
21 22.03ms 22.00ms 34.70µs {"mode":"Fuzzy","query":"!git co"}
22 17.13ms 17.00ms 133.00µs {"mode":"Fuzzy","query":"!git com"}
23 16.14ms 15.90ms 242.00µs {"mode":"Fuzzy","query":"!git comm"}
24 5.11ms 5.08ms 28.60µs {"mode":"Fuzzy","query":"!git commi"}
25 7.31ms 7.26ms 52.70µs {"mode":"Fuzzy","query":"!git commit"}
Summary: 25 calls
Wall: avg=18.07ms, min=3.59ms, max=62.46ms, p50=13.07ms, p99=62.46ms
Busy: avg=17.95ms, min=3.53ms, max=62.40ms, p50=13.00ms, p99=62.40ms
```
### Daemon-based search
**Client**
```
Individual calls for: daemon_search
--------------------------------------------------------------------------------------------------------------
# Wall Busy Idle Fields
--------------------------------------------------------------------------------------------------------------
1 13.05ms 2.55ms 10.50ms {"query":"^"}
2 10.65ms 1.40ms 9.25ms {"query":"^c"}
3 10.72ms 1.18ms 9.54ms {"query":"^ca"}
4 5.54ms 485.00µs 5.06ms {"query":"^car"}
5 15.02ms 1.02ms 14.00ms {"query":"^carg"}
6 9.49ms 840.00µs 8.65ms {"query":"^cargo"}
7 5.53ms 555.00µs 4.97ms {"query":"^cargo "}
8 8.56ms 717.00µs 7.84ms {"query":"^cargo b"}
9 12.34ms 1.24ms 11.10ms {"query":"^cargo bu"}
10 8.38ms 650.00µs 7.73ms {"query":"^cargo bui"}
11 13.07ms 770.00µs 12.30ms {"query":"^cargo buil"}
12 17.11ms 709.00µs 16.40ms {"query":"^cargo build"}
13 15.41ms 907.00µs 14.50ms {"query":"^cargo build$"}
14 8.19ms 665.00µs 7.52ms {"query":"^cargo "}
15 7.98ms 1.72ms 6.26ms {"query":"!"}
16 13.56ms 856.00µs 12.70ms {"query":"!g"}
17 8.11ms 624.00µs 7.49ms {"query":"!gi"}
18 14.57ms 775.00µs 13.80ms {"query":"!git"}
19 14.18ms 779.00µs 13.40ms {"query":"!git "}
20 9.62ms 802.00µs 8.82ms {"query":"!git c"}
21 15.50ms 1.50ms 14.00ms {"query":"!git co"}
22 11.58ms 1.48ms 10.10ms {"query":"!git com"}
23 13.82ms 2.12ms 11.70ms {"query":"!git comm"}
24 17.48ms 2.18ms 15.30ms {"query":"!git commi"}
25 14.81ms 1.71ms 13.10ms {"query":"!git commit"}
Summary: 25 calls
Wall: avg=11.77ms, min=5.53ms, max=17.48ms, p50=12.34ms, p99=17.48ms
Busy: avg=1.13ms, min=485.00µs, max=2.55ms, p50=856.00µs, p99=2.55ms
```
**Daemon**
```
Individual calls for: daemon_search_query
--------------------------------------------------------------------------------------------------------------
# Wall Busy Idle Fields
--------------------------------------------------------------------------------------------------------------
1 1.75ms 250ns 1.75ms {"query":"^","query_id":1}
2 4.58ms 125ns 4.58ms {"query":"^c","query_id":2}
3 4.39ms 250ns 4.39ms {"query":"^ca","query_id":3}
4 2.52ms 125ns 2.52ms {"query":"^car","query_id":4}
5 4.44ms 250ns 4.44ms {"query":"^carg","query_id":5}
6 3.66ms 167ns 3.66ms {"query":"^cargo","query_id":6}
7 2.38ms 84ns 2.38ms {"query":"^cargo ","query_id":7}
8 4.13ms 84ns 4.13ms {"query":"^cargo b","query_id":8}
9 4.40ms 167ns 4.40ms {"query":"^cargo bu","query_id":9}
10 3.87ms 125ns 3.87ms {"query":"^cargo bui","query_id":10}
11 4.36ms 84ns 4.36ms {"query":"^cargo buil","query_id":11}
12 3.96ms 333ns 3.96ms {"query":"^cargo build","query_id":12}
13 4.61ms 167ns 4.61ms {"query":"^cargo build$","query_id":13}
14 4.20ms 209ns 4.20ms {"query":"^cargo ","query_id":14}
15 238.17µs 167ns 238.00µs {"query":"!","query_id":15}
16 4.44ms 125ns 4.44ms {"query":"!g","query_id":16}
17 3.47ms 83ns 3.47ms {"query":"!gi","query_id":17}
18 4.57ms 125ns 4.57ms {"query":"!git","query_id":18}
19 7.15ms 167ns 7.15ms {"query":"!git ","query_id":19}
20 4.27ms 250ns 4.27ms {"query":"!git c","query_id":20}
21 5.19ms 292ns 5.19ms {"query":"!git co","query_id":21}
22 4.29ms 417ns 4.29ms {"query":"!git com","query_id":22}
23 4.08ms 125ns 4.08ms {"query":"!git comm","query_id":23}
24 4.50ms 167ns 4.50ms {"query":"!git commi","query_id":24}
25 4.35ms 208ns 4.35ms {"query":"!git commit","query_id":25}
Summary: 25 calls
Wall: avg=3.99ms, min=238.17µs, max=7.15ms, p50=4.29ms, p99=7.15ms
Busy: avg=182ns, min=83ns, max=417ns, p50=167ns, p99=417ns
```
**Nucleo matching time (in daemon)**
```
Individual calls for: nucleo_match
--------------------------------------------------------------------------------------------------------------
# Wall Busy Idle Fields
--------------------------------------------------------------------------------------------------------------
1 1.73ms 125ns 1.73ms {"query":"^","query_id":1}
2 4.57ms 167ns 4.57ms {"query":"^c","query_id":2}
3 4.37ms 125ns 4.37ms {"query":"^ca","query_id":3}
4 2.51ms 84ns 2.51ms {"query":"^car","query_id":4}
5 4.43ms 125ns 4.43ms {"query":"^carg","query_id":5}
6 3.64ms 125ns 3.64ms {"query":"^cargo","query_id":6}
7 2.37ms 84ns 2.37ms {"query":"^cargo ","query_id":7}
8 4.11ms 125ns 4.11ms {"query":"^cargo b","query_id":8}
9 4.36ms 208ns 4.36ms {"query":"^cargo bu","query_id":9}
10 3.85ms 125ns 3.85ms {"query":"^cargo bui","query_id":10}
11 4.35ms 125ns 4.35ms {"query":"^cargo buil","query_id":11}
12 3.94ms 250ns 3.94ms {"query":"^cargo build","query_id":12}
13 4.59ms 125ns 4.59ms {"query":"^cargo build$","query_id":13}
14 4.18ms 84ns 4.18ms {"query":"^cargo ","query_id":14}
15 220.13µs 125ns 220.00µs {"query":"!","query_id":15}
16 4.43ms 125ns 4.43ms {"query":"!g","query_id":16}
17 3.45ms 125ns 3.45ms {"query":"!gi","query_id":17}
18 4.55ms 125ns 4.55ms {"query":"!git","query_id":18}
19 7.12ms 209ns 7.12ms {"query":"!git ","query_id":19}
20 4.25ms 166ns 4.25ms {"query":"!git c","query_id":20}
21 5.18ms 125ns 5.18ms {"query":"!git co","query_id":21}
22 4.27ms 125ns 4.27ms {"query":"!git com","query_id":22}
23 4.06ms 292ns 4.06ms {"query":"!git comm","query_id":23}
24 4.46ms 166ns 4.46ms {"query":"!git commi","query_id":24}
25 4.31ms 208ns 4.31ms {"query":"!git commit","query_id":25}
Summary: 25 calls
Wall: avg=3.97ms, min=220.13µs, max=7.12ms, p50=4.27ms, p99=7.12ms
Busy: avg=147ns, min=84ns, max=292ns, p50=125ns, p99=292ns
```
Diffstat (limited to 'crates/atuin-client/src')
| -rw-r--r-- | crates/atuin-client/src/database.rs | 192 | ||||
| -rw-r--r-- | crates/atuin-client/src/hub.rs | 20 | ||||
| -rw-r--r-- | crates/atuin-client/src/settings.rs | 221 |
3 files changed, 431 insertions, 2 deletions
diff --git a/crates/atuin-client/src/database.rs b/crates/atuin-client/src/database.rs index 5f292bec..7c63368d 100644 --- a/crates/atuin-client/src/database.rs +++ b/crates/atuin-client/src/database.rs @@ -138,9 +138,13 @@ pub trait Database: Send + Sync + 'static { async fn all_with_count(&self) -> Result<Vec<(History, i32)>>; + fn all_paged(&self, page_size: usize, include_deleted: bool, unique: bool) -> Paged; + async fn stats(&self, h: &History) -> Result<HistoryStats>; async fn get_dups(&self, before: i64, dupkeep: u32) -> Result<Vec<History>>; + + fn clone_boxed(&self) -> Box<dyn Database + 'static>; } // Intended for use on a developer machine and not a sync server. @@ -650,6 +654,10 @@ impl Database for Sqlite { Ok(res) } + fn all_paged(&self, page_size: usize, include_deleted: bool, unique: bool) -> Paged { + Paged::new(Box::new(self.clone()), page_size, include_deleted, unique) + } + // deleted_at doesn't mean the actual time that the user deleted it, // but the time that the system marks it as deleted async fn delete(&self, mut h: History) -> Result<()> { @@ -814,6 +822,70 @@ impl Database for Sqlite { Ok(res) } + + fn clone_boxed(&self) -> Box<dyn Database + 'static> { + Box::new(self.clone()) + } +} + +pub struct Paged { + database: Box<dyn Database + 'static>, + page_size: usize, + last_id: Option<String>, + include_deleted: bool, + unique: bool, +} + +impl Paged { + pub fn new( + database: Box<dyn Database + 'static>, + page_size: usize, + include_deleted: bool, + unique: bool, + ) -> Self { + Self { + database, + page_size, + last_id: None, + include_deleted, + unique, + } + } + + pub async fn next(&mut self) -> Result<Option<Vec<History>>> { + let mut query = SqlBuilder::select_from(SqlName::new("history").alias("h").baquoted()); + + query.field("*").order_desc("id"); + + if !self.include_deleted { + query.and_where_is_null("deleted_at"); + } + + if self.unique { + // We want to deduplicate on command, but the user can search via cwd, hostname, and session. + // Without those fields, filter modes won't work right. With those fields, we get duplicates. + // This must be handled upstream. + query + .group_by("command, cwd, hostname, session") + .having("max(timestamp)"); + } + + query.limit(self.page_size); + + if let Some(last_id) = &self.last_id { + query.and_where_lt("id", quote(last_id)); + } + + let query = query.sql().expect("bug in list query. please report"); + let res = self.database.query_history(&query).await?; + + if res.is_empty() { + Ok(None) + } else { + self.last_id = Some(res.last().unwrap().id.0.clone()); + Ok(Some(res)) + } + } } trait SqlBuilderExt { @@ -1166,6 +1238,126 @@ mod test { } #[tokio::test(flavor = "multi_thread")] + async fn test_paged_basic() { + let mut db = Sqlite::new("sqlite::memory:", test_local_timeout()) + .await + .unwrap(); + + // Add 5 history items + for i in 0..5 { + new_history_item(&mut db, &format!("command{}", i)) + .await + .unwrap(); + } + + // Create a paged iterator with page_size of 2 + let mut paged = db.all_paged(2, false, false); + + // First page should have 2 items + let page1 = paged.next().await.unwrap(); + assert!(page1.is_some()); + assert_eq!(page1.unwrap().len(), 2); + + // Second page should have 2 items + let page2 = paged.next().await.unwrap(); + assert!(page2.is_some()); + assert_eq!(page2.unwrap().len(), 2); + + // Third page should have 1 item + let page3 = paged.next().await.unwrap(); + assert!(page3.is_some()); + assert_eq!(page3.unwrap().len(), 1); + + // Fourth page should be None (exhausted) + let page4 = paged.next().await.unwrap(); + assert!(page4.is_none()); + } + + #[tokio::test(flavor = "multi_thread")] + async fn test_paged_empty() { + let db = Sqlite::new("sqlite::memory:", test_local_timeout()) + .await + .unwrap(); + + // Create a paged iterator on empty database + let mut paged = db.all_paged(10, false, false); + + // Should return None immediately + let page = paged.next().await.unwrap(); + assert!(page.is_none()); + } + + #[tokio::test(flavor = "multi_thread")] + async fn test_paged_unique() { + let mut db = Sqlite::new("sqlite::memory:", test_local_timeout()) + .await + .unwrap(); + + // Add duplicate commands + new_history_item(&mut db, "duplicate").await.unwrap(); + new_history_item(&mut db, "duplicate").await.unwrap(); + new_history_item(&mut db, "unique1").await.unwrap(); + new_history_item(&mut db, "unique2").await.unwrap(); + + // Without unique flag - should get all 4 + let mut paged = db.all_paged(10, false, false); + let page = paged.next().await.unwrap().unwrap(); + assert_eq!(page.len(), 4); + + // With unique flag - should get 3 (duplicates collapsed) + let mut paged_unique = db.all_paged(10, false, true); + let page_unique = paged_unique.next().await.unwrap().unwrap(); + assert_eq!(page_unique.len(), 3); + } + + #[tokio::test(flavor = "multi_thread")] + async fn test_paged_include_deleted() { + let mut db = Sqlite::new("sqlite::memory:", test_local_timeout()) + .await + .unwrap(); + + // Add items + new_history_item(&mut db, "keep1").await.unwrap(); + new_history_item(&mut db, "keep2").await.unwrap(); + new_history_item(&mut db, "delete_me").await.unwrap(); + + // Delete one item + let all = db + .list( + &[], + &Context { + hostname: "".to_string(), + session: "".to_string(), + cwd: "".to_string(), + host_id: "".to_string(), + git_root: None, + }, + None, + false, + false, + ) + .await + .unwrap(); + + let to_delete = all + .iter() + .find(|h| h.command == "delete_me") + .unwrap() + .clone(); + db.delete(to_delete).await.unwrap(); + + // Without include_deleted - should get 2 + let mut paged = db.all_paged(10, false, false); + let page = paged.next().await.unwrap().unwrap(); + assert_eq!(page.len(), 2); + + // With include_deleted - should get 3 + let mut paged_deleted = db.all_paged(10, true, false); + let page_deleted = paged_deleted.next().await.unwrap().unwrap(); + assert_eq!(page_deleted.len(), 3); + } + + #[tokio::test(flavor = "multi_thread")] async fn test_search_bench_dupes() { let context = Context { hostname: "test:host".to_string(), diff --git a/crates/atuin-client/src/hub.rs b/crates/atuin-client/src/hub.rs index 5b34574b..b94c69ea 100644 --- a/crates/atuin-client/src/hub.rs +++ b/crates/atuin-client/src/hub.rs @@ -58,10 +58,14 @@ impl HubAuthSession { /// /// Returns a session containing the code and auth URL that the user should visit. pub async fn start(settings: &Settings) -> Result<Self> { + debug!("Starting Hub authentication process..."); + let code_response = request_code(&settings.hub_address) .await .context("Failed to request authentication code from Hub")?; + debug!("Received code from Hub"); + let code = code_response.code; let auth_url = format!("{}/auth/cli?code={}", settings.hub_address, code); @@ -79,8 +83,10 @@ impl HubAuthSession { match verify_code(&self.hub_address, &self.code).await { Ok(response) => { if let Some(token) = response.token { + debug!("Authentication complete, received token"); Ok(HubAuthStatus::Complete(token)) } else if let Some(error) = response.error { + error!("Authentication failed: {}", error); Ok(HubAuthStatus::Failed(error)) } else { Ok(HubAuthStatus::Pending) @@ -105,8 +111,11 @@ impl HubAuthSession { ) -> Result<String> { let start = std::time::Instant::now(); + debug!("Polling for Hub authentication completion..."); + loop { if start.elapsed() > timeout { + warn!("Authentication loop exited due to timeout"); bail!("Authentication timed out. Please try again."); } @@ -181,17 +190,21 @@ async fn handle_resp_error(resp: reqwest::Response) -> Result<reqwest::Response> let status = resp.status(); if status == StatusCode::SERVICE_UNAVAILABLE { + error!("Service unavailable: check https://status.atuin.sh"); bail!("Service unavailable: check https://status.atuin.sh"); } if status == StatusCode::TOO_MANY_REQUESTS { + error!("Rate limited; please wait before trying again"); bail!("Rate limited; please wait before trying again"); } if !status.is_success() { if let Ok(error) = resp.json::<ErrorResponse>().await { + error!("Hub error: {} - {}", status, error.reason); bail!("Hub error: {} - {}", status, error.reason); } + error!("Hub request failed with status: {}", status); bail!("Hub request failed with status: {}", status); } @@ -204,6 +217,8 @@ async fn request_code(address: &str) -> Result<CliCodeResponse> { let url = make_url(address, "/auth/cli/code")?; let client = reqwest::Client::new(); + debug!("Requesting code from Hub at {url}"); + let resp = client .post(&url) .header(USER_AGENT, APP_USER_AGENT) @@ -219,9 +234,12 @@ async fn request_code(address: &str) -> Result<CliCodeResponse> { /// Poll to verify the CLI auth code and get the session token async fn verify_code(address: &str, code: &str) -> Result<CliVerifyResponse> { ensure_crypto_provider(); - let url = make_url(address, &format!("/auth/cli/verify?code={}", code))?; + let base = make_url(address, "/auth/cli/verify")?; + let url = format!("{base}?code={code}"); let client = reqwest::Client::new(); + debug!("Verifying code with Hub at {base}?code=******"); + let resp = client .post(&url) .header(USER_AGENT, APP_USER_AGENT) diff --git a/crates/atuin-client/src/settings.rs b/crates/atuin-client/src/settings.rs index a15ce461..8e874832 100644 --- a/crates/atuin-client/src/settings.rs +++ b/crates/atuin-client/src/settings.rs @@ -42,6 +42,10 @@ pub enum SearchMode { #[serde(rename = "skim")] Skim, + + #[serde(rename = "daemon-fuzzy")] + #[clap(aliases = &["daemon-fuzzy"])] + DaemonFuzzy, } impl SearchMode { @@ -51,6 +55,7 @@ impl SearchMode { SearchMode::FullText => "FULLTXT", SearchMode::Fuzzy => "FUZZY", SearchMode::Skim => "SKIM", + SearchMode::DaemonFuzzy => "DAEMON", } } pub fn next(&self, settings: &Settings) -> Self { @@ -58,9 +63,13 @@ impl SearchMode { SearchMode::Prefix => SearchMode::FullText, // if the user is using skim, we go to skim SearchMode::FullText if settings.search_mode == SearchMode::Skim => SearchMode::Skim, + // if the user is using daemon-fuzzy, we go to daemon-fuzzy + SearchMode::FullText if settings.search_mode == SearchMode::DaemonFuzzy => { + SearchMode::DaemonFuzzy + } // otherwise fuzzy. SearchMode::FullText => SearchMode::Fuzzy, - SearchMode::Fuzzy | SearchMode::Skim => SearchMode::Prefix, + SearchMode::Fuzzy | SearchMode::Skim | SearchMode::DaemonFuzzy => SearchMode::Prefix, } } } @@ -477,6 +486,78 @@ pub struct Tmux { pub height: String, } +/// Log level for file logging. Maps to tracing's LevelFilter. +#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Deserialize, Serialize)] +#[serde(rename_all = "lowercase")] +pub enum LogLevel { + Trace, + Debug, + #[default] + Info, + Warn, + Error, +} + +impl LogLevel { + /// Convert to a tracing directive string for use with EnvFilter. + pub fn as_directive(&self) -> &'static str { + match self { + LogLevel::Trace => "trace", + LogLevel::Debug => "debug", + LogLevel::Info => "info", + LogLevel::Warn => "warn", + LogLevel::Error => "error", + } + } +} + +/// Configuration for a specific log type (search or daemon). +#[derive(Clone, Debug, Default, Deserialize, Serialize)] +pub struct LogConfig { + /// Log file name (relative to dir) or absolute path. + pub file: String, + + /// Override global enabled setting for this log type. + pub enabled: Option<bool>, + + /// Override global level setting for this log type. + pub level: Option<LogLevel>, + + /// Override global retention days setting for this log type. + pub retention: Option<u64>, +} + +#[derive(Clone, Debug, Deserialize, Serialize)] +pub struct Logs { + /// Enable file logging globally. Defaults to true. + #[serde(default = "Logs::default_enabled")] + pub enabled: bool, + + /// Directory for log files. Defaults to ~/.atuin/logs + pub dir: String, + + /// Default log level for file logging. Defaults to "info". + /// Note: ATUIN_LOG environment variable overrides this. + #[serde(default)] + pub level: LogLevel, + + /// Default retention days for log files. Defaults to 4. + #[serde(default = "Logs::default_retention")] + pub retention: u64, + + /// Search log settings + #[serde(default)] + pub search: LogConfig, + + /// Daemon log settings + #[serde(default)] + pub daemon: LogConfig, + + /// AI log settings + #[serde(default)] + pub ai: LogConfig, +} + #[derive(Default, Clone, Debug, Deserialize, Serialize)] pub struct Ai { /// The address of the Atuin AI endpoint. Used for AI features like command generation. @@ -523,6 +604,117 @@ impl Default for Daemon { } } +impl Default for Logs { + fn default() -> Self { + Self { + enabled: true, + dir: "".to_string(), + level: LogLevel::default(), + retention: Self::default_retention(), + search: LogConfig { + file: "search.log".to_string(), + ..Default::default() + }, + daemon: LogConfig { + file: "daemon.log".to_string(), + ..Default::default() + }, + ai: LogConfig { + file: "ai.log".to_string(), + ..Default::default() + }, + } + } +} + +impl Logs { + fn default_enabled() -> bool { + true + } + + fn default_retention() -> u64 { + 4 + } + + /// Returns whether search logging is enabled. + /// Uses search-specific setting if set, otherwise falls back to global. + pub fn search_enabled(&self) -> bool { + self.search.enabled.unwrap_or(self.enabled) + } + + /// Returns whether daemon logging is enabled. + /// Uses daemon-specific setting if set, otherwise falls back to global. + pub fn daemon_enabled(&self) -> bool { + self.daemon.enabled.unwrap_or(self.enabled) + } + + /// Returns whether AI logging is enabled. + /// Uses AI-specific setting if set, otherwise falls back to global. + pub fn ai_enabled(&self) -> bool { + self.ai.enabled.unwrap_or(self.enabled) + } + + /// Returns the log level for search logging. + /// Uses search-specific setting if set, otherwise falls back to global. + pub fn search_level(&self) -> LogLevel { + self.search.level.unwrap_or(self.level) + } + + /// Returns the log level for daemon logging. + /// Uses daemon-specific setting if set, otherwise falls back to global. + pub fn daemon_level(&self) -> LogLevel { + self.daemon.level.unwrap_or(self.level) + } + + /// Returns the log level for AI logging. + /// Uses AI-specific setting if set, otherwise falls back to global. + pub fn ai_level(&self) -> LogLevel { + self.ai.level.unwrap_or(self.level) + } + + /// Returns the retention days for search logging. + /// Uses search-specific setting if set, otherwise falls back to global. + pub fn search_retention(&self) -> u64 { + self.search.retention.unwrap_or(self.retention) + } + + /// Returns the retention days for daemon logging. + /// Uses daemon-specific setting if set, otherwise falls back to global. + pub fn daemon_retention(&self) -> u64 { + self.daemon.retention.unwrap_or(self.retention) + } + + /// Returns the retention days for AI logging. + /// Uses AI-specific setting if set, otherwise falls back to global. + pub fn ai_retention(&self) -> u64 { + self.ai.retention.unwrap_or(self.retention) + } + + /// Returns the full path for the search log file. + /// If `file` is an absolute path, returns it directly. + /// Otherwise, joins it with `dir`. + pub fn search_path(&self) -> PathBuf { + let path = PathBuf::from(&self.search.file); + if path.is_absolute() { + path + } else { + PathBuf::from(&self.dir).join(path) + } + } + + /// Returns the full path for the daemon log file. + /// If `file` is an absolute path, returns it directly. + /// Otherwise, joins it with `dir`. + pub fn daemon_path(&self) -> PathBuf { + let path = PathBuf::from(&self.daemon.file); + if path.is_absolute() { + path + } else { + PathBuf::from(&self.dir).join(path) + } + } +} + impl Default for Search { fn default() -> Self { Self { @@ -849,6 +1041,9 @@ pub struct Settings { pub tmux: Tmux, #[serde(default)] + pub logs: Logs, + + #[serde(default)] pub meta: meta::Settings, #[serde(default)] @@ -1033,6 +1228,7 @@ impl Settings { let scripts_path = data_dir.join("scripts.db"); let socket_path = atuin_common::utils::runtime_dir().join("atuin.sock"); let pidfile_path = data_dir.join("atuin-daemon.pid"); + let logs_dir = atuin_common::utils::logs_dir(); let key_path = data_dir.join("key"); let meta_path = data_dir.join("meta.db"); @@ -1101,6 +1297,12 @@ impl Settings { .set_default("daemon.pidfile_path", pidfile_path.to_str())? .set_default("daemon.systemd_socket", false)? .set_default("daemon.tcp_port", 8889)? + .set_default("logs.enabled", true)? + .set_default("logs.dir", logs_dir.to_str())? + .set_default("logs.level", "info")? + .set_default("logs.search.file", "search.log")? + .set_default("logs.daemon.file", "daemon.log")? + .set_default("logs.ai.file", "ai.log")? .set_default("kv.db_path", kv_path.to_str())? .set_default("scripts.db_path", scripts_path.to_str())? .set_default("meta.db_path", meta_path.to_str())? @@ -1218,6 +1420,9 @@ impl Settings { settings.key_path = Self::expand_path(settings.key_path)?; settings.daemon.socket_path = Self::expand_path(settings.daemon.socket_path)?; settings.daemon.pidfile_path = Self::expand_path(settings.daemon.pidfile_path)?; + settings.logs.dir = Self::expand_path(settings.logs.dir)?; + settings.logs.search.file = Self::expand_path(settings.logs.search.file)?; + settings.logs.daemon.file = Self::expand_path(settings.logs.daemon.file)?; // Validate UI settings settings.ui.validate()?; @@ -1264,6 +1469,20 @@ impl Default for Settings { } } +/// Initialize the meta store configuration for testing. +/// +/// This should only be used in tests. It allows tests to bypass the normal +/// Settings::new() flow while still being able to use Settings::host_id() +/// and other meta store dependent functions. +/// +/// # Safety +/// This function is not thread-safe with concurrent calls to Settings::new() +/// or other meta store initialization. Only call from tests. +#[doc(hidden)] +pub fn init_meta_config_for_testing(meta_db_path: impl Into<String>, local_timeout: f64) { + META_CONFIG.set((meta_db_path.into(), local_timeout)).ok(); +} + #[cfg(test)] pub(crate) fn test_local_timeout() -> f64 { std::env::var("ATUIN_TEST_LOCAL_TIMEOUT") |
