| 1 | +# Polars Scanner – `alib` → `library.db` Mapping |
| 2 | + |
| 3 | +The Python-based scanner consumes the staging table `alib` (produced by `tags2db-polars-multidrive-optimised.py`) and reproduces the database layout that the legacy Perl scanner leaves in `library.db`. The mapping below documents how every field is derived. Column names prefixed with `__` come directly from the staging table, while others are tag fields extracted by `audioinfo`. |
| 4 | + |
| 5 | +## Tracks (`tracks`) |
| 6 | + |
| 7 | +| `tracks` column | Source in `alib` | Notes | |
| 8 | +| --- | --- | --- | |
| 9 | +| `url` | `fileURL(__path)` | Stored as `file://` URL (same encoding as `Slim::Utils::Misc::fileURLFromPath`). | |
| 10 | +| `title` | `title` | Raw tag value. |
| 11 | +| `titlesort` | `title` | `Slim::Utils::Text::ignoreCaseArticles()` equivalent (upper-case, articles stripped, punctuation removed). |
| 12 | +| `titlesearch` | `title` | `ignoreCase()` equivalent (case-insensitive, no article stripping). |
| 13 | +| `customsearch` | `title` + `subtitle` | Concatenated and normalized identically to the Perl scanner. |
| 14 | +| `album` | FK produced from `album` table mapping | Deterministic key: (`album`, `albumartist`, `musicbrainz_albumid`, `discnumber`, `compilation`). |
| 15 | +| `tracknum` | `track` | Parsed to integer; zero if missing. |
| 16 | +| `content_type` | `__filetype` | Lower-case; same values the server uses (e.g., `flc`, `mp3`). |
| 17 | +| `timestamp` | `__file_mod_datetime_raw` | Seconds since epoch; float accepted. |
| 18 | +| `filesize` | `__file_size_bytes` | Integer bytes. |
| 19 | +| `audio_size` | `__file_size_bytes` | Same as `filesize` when full file scanned. |
| 20 | +| `audio_offset` | `0` | Not tracked in staging data. |
| 21 | +| `year` | `year` or fallback `originalyear` | Prefer release year; integer. |
| 22 | +| `secs` | `__length_seconds` | Float seconds with millisecond precision. |
| 23 | +| `cover` | `NULL` | Artwork from file is handled later by LMS. |
| 24 | +| `vbr_scale` | `__bitrate` | Retained as text description. |
| 25 | +| `bitrate` | `__bitrate_num` | Numeric kbps; converted from string if needed. |
| 26 | +| `samplerate` | `__frequency_num` | Integer Hz. |
| 27 | +| `samplesize` | `__bitspersample` | Integer. |
| 28 | +| `channels` | `__channels` | Integer. |
| 29 | +| `block_alignment` | `NULL` | Not exposed by `alib`. |
| 30 | +| `endian` | `NULL` | Not exposed. |
| 31 | +| `bpm` | `__bpm` if present | New tag fallback. |
| 32 | +| `tagversion` | `__version` | |
| 33 | +| `drm` | `explicit` flag | 1 for DRM/explicit when flagged. |
| 34 | +| `disc` | `disc` or `discnumber` | Integer disc index. |
| 35 | +| `audio` | `1` | All staged files are audio. |
| 36 | +| `remote` | `0` | Staging only contains local files. |
| 37 | +| `lossless` | Derived from extension (`flac`,`alac`,`wv`,`aiff`,`ape`) | Boolean flag. |
| 38 | +| `lyrics` | `lyrics` or `unsyncedlyrics` | UTF-8 text. |
| 39 | +| `musicbrainz_id` | `musicbrainz_trackid` | 36-char UUID. |
| 40 | +| `musicmagic_mixable` | `analysis` | Set to 1 when the `analysis` tag is non-null, otherwise 0. |
| 41 | +| `replay_gain` | `replaygain_track_gain` | Normalized float. |
| 42 | +| `replay_peak` | `replaygain_track_peak` | Float. |
| 43 | +| `extid` | `tagminder_uuid` | Custom stable GUID for track. |
| 44 | +| `urlmd5` | `md5(url)` | 32-char hex, same as Perl scanner. |
| 45 | +| `coverid` | `NULL` | Populated later by LMS artwork pipeline. |
| 46 | +| `cover_cached` | `NULL` | |
| 47 | +| `virtual` | `0` | Only physical files handled. |
| 48 | +| `added_time` | `__file_mod_datetime_raw` | Mirrors `updated_time` like the Perl scanner. |
| 49 | +| `updated_time` | `__file_mod_datetime_raw` | Same as `timestamp`. |
| 50 | + |
| 51 | +Any column not backed by staging data is set to `NULL` to allow LMS to fill it later. |
| 52 | + |
| 53 | +## Albums (`albums`) |
| 54 | + |
| 55 | +| Column | Source | Notes | |
| 56 | +| --- | --- | --- | |
| 57 | +| `title` | `album` | |
| 58 | +| `titlesort` | `album` | `ignoreCaseArticles()`. |
| 59 | +| `titlesearch` | `album` | `ignoreCase()`. |
| 60 | +| `customsearch` | `album` + `albumartist` | Same logic as Perl scanner. |
| 61 | +| `compilation` | `compilation` tag or heuristics | 1 if tagged or if there are multiple track artists and no albumartist tag. |
| 62 | +| `year` | `year` or `originalyear` | |
| 63 | +| `artwork` | `NULL` | Linked once tracks inserted. |
| 64 | +| `disc` | minimum disc number | |
| 65 | +| `discc` | maximum disc count across album | |
| 66 | +| `replay_gain` / `replay_peak` | album-level replaygain tags | |
| 67 | +| `musicbrainz_id` | `musicbrainz_albumid` | |
| 68 | +| `musicmagic_mixable` | `amgtagged` | |
| 69 | +| `contributor` | FK to album-artist `contributors.id` | Derived from albumartist tag or fallback to track artist. |
| 70 | + |
| 71 | +Albums are keyed by (`title`,`albumartist`,`musicbrainz_albumid`,`discc`,`compilation`). |
| 72 | + |
| 73 | +## Contributors (`contributors`) |
| 74 | + |
| 75 | +Names are taken from the respective tag fields (artist, albumartist, composer, conductor, etc.). For each role we: |
| 76 | + |
| 77 | +1. Split multi-value tags on the literal `\\` delimiter (the same one the staging file uses). |
| 78 | +2. Derive `namesort` (`ignoreCaseArticles`) and `namesearch` (`ignoreCase`), transliterating to ASCII for the indexes. |
| 79 | +3. Store MusicBrainz IDs where available (e.g., `musicbrainz_artistid`, `musicbrainz_albumartistid`). |
| 80 | + |
| 81 | +Roles follow the LMS internal IDs: |
| 82 | + |
| 83 | +| Role | ID | `alib` column(s) | |
| 84 | +| --- | --- | --- | |
| 85 | +| ARTIST | 1 | `artist` | |
| 86 | +| COMPOSER | 2 | `composer` | |
| 87 | +| CONDUCTOR | 3 | `conductor` | |
| 88 | +| BAND | 4 | `ensemble`, `band`, `orchestra` | |
| 89 | +| ALBUMARTIST | 5 | `albumartist` | |
| 90 | +| TRACKARTIST | 6 | `artist` (when distinct per track) | |
| 91 | + |
| 92 | +Custom roles defined in `userDefinedRoles` (from `server.prefs`) are appended, preserving their IDs (≥ 21) so plugins continue to work. |
| 93 | + |
| 94 | +`contributor_track` receives one row per `(track_id, role_id, contributor_id)` triple, and `contributor_album` aggregates `(album_id, role_id, contributor_id)` to mirror the Perl scanner’s behavior. |
| 95 | + |
| 96 | +## Genres (`genres`, `genre_track`) |
| 97 | + |
| 98 | +Tags `genre` and `style` are combined. Each multi-value tag is split on the `\\` delimiter, normalized using the same case-folding as contributors, inserted (deduped) into `genres`, and related to tracks through `genre_track`. `mood` and `theme` stay as plain tags and are not mapped to `genres`. |
| 99 | + |
| 100 | +## Playlist & Comments |
| 101 | + |
| 102 | +When `alib` exposes playlist or comment metadata: |
| 103 | + |
| 104 | +- `playlist_track` is meant to be rebuilt from staged playlist rows (path stored under `__path` with `__tag == 'playlist'`). For now, because staging focuses on audio files, the scanner leaves existing playlist rows untouched and LMS rebuilds playlists on demand. |
| 105 | +- The `comments` table receives `review`/`lyrics`-style annotations keyed by track ID. |
| 106 | + |
| 107 | +## Virtual Libraries & Release Types |
| 108 | + |
| 109 | +After the core tables are populated: |
| 110 | + |
| 111 | +1. `library_track`, `library_album`, `library_contributor`, and `library_genre` are regenerated when virtual libraries are enabled (matching the SQL found in `Slim/Music/VirtualLibraries.pm`). |
| 112 | +2. Release type hints run the same queries as `Slim::Music::ReleaseTypes` to update `albums.release_type` for Singles/EPs. |
| 113 | + |
| 114 | +## Full Text Search (optional) |
| 115 | + |
| 116 | +If the Full Text Search plugin is enabled (`plugin.fulltext` pref), the scanner rebuilds `fulltext` and `fulltext_terms` with the SQL templates embedded in `Slim::Plugin::FullTextSearch::Plugin`. The resulting tables are byte-for-byte compatible with the plugin’s importer, meaning LMS will not trigger another FTS rebuild when it starts. |
| 117 | + |
| 118 | +--- |
| 119 | + |
| 120 | +This mapping guarantees that every table `scanner.pl` touches is populated with equivalent data, enabling the new Polars-based scanner to be a drop-in replacement. |