{"id":"c92760ee-e31f-4996-8157-664c60fa0a58","shortId":"EDJaHd","kind":"skill","title":"Apple On-Device AI","tagline":"Swift iOS Skills skill by dpearson2699","description":"# On-Device AI for Apple Platforms\n\nGuide for selecting, deploying, and optimizing on-device ML models. Covers Apple\nFoundation Models, Core ML, MLX Swift, and llama.cpp.\n\n## Contents\n\n- [Framework Selection Router](#framework-selection-router)\n- [Apple Foundation Models Overview](#apple-foundation-models-overview)\n- [Core ML Overview](#core-ml-overview)\n- [MLX Swift Overview](#mlx-swift-overview)\n- [Multi-Backend Architecture](#multi-backend-architecture)\n- [Performance Best Practices](#performance-best-practices)\n- [Common Mistakes](#common-mistakes)\n- [Review Checklist](#review-checklist)\n- [References](#references)\n\n## Framework Selection Router\n\nUse this decision tree to pick the right framework for your use case.\n\n### Apple Foundation Models\n\n**When to use:** Text generation, summarization, entity extraction, structured\noutput, and short dialog on iOS 26+ / macOS 26+ devices with Apple Intelligence\nenabled. Zero setup -- no API keys, no network, no model downloads.\n\n**Best for:**\n- Generating text or structured data with `@Generable` types\n- Summarization, classification, content tagging\n- Tool-augmented generation with the `Tool` protocol\n- Apps that need guaranteed on-device privacy\n\n**Not suited for:** Complex math, code generation, factual accuracy tasks,\nor apps targeting pre-iOS 26 devices.\n\n### Core ML\n\n**When to use:** Deploying custom trained models (vision, NLP, audio) across all\nApple platforms. 
Converting models from PyTorch, TensorFlow, or scikit-learn\nwith coremltools.\n\n**Best for:**\n- Image classification, object detection, segmentation\n- Custom NLP classifiers, sentiment analysis models\n- Audio/speech models via SoundAnalysis integration\n- Any scenario needing Neural Engine optimization\n- Models requiring quantization, palettization, or pruning\n\n### MLX Swift\n\n**When to use:** Running specific open-source LLMs (Llama, Mistral, Qwen, Gemma)\non Apple Silicon with maximum throughput. Research and prototyping.\n\n**Best for:**\n- Highest sustained token generation on Apple Silicon\n- Running Hugging Face models from `mlx-community`\n- Research requiring automatic differentiation\n- Fine-tuning workflows on Mac\n\n### llama.cpp\n\n**When to use:** Cross-platform LLM inference using GGUF model format. Production\ndeployments needing broad device support.\n\n**Best for:**\n- GGUF quantized models (Q4_K_M, Q5_K_M, Q8_0)\n- Cross-platform apps (iOS + Android + desktop)\n- Maximum compatibility with open-source model ecosystem\n\n### Quick Reference\n\n| Scenario | Framework |\n|---|---|\n| Text generation, zero setup (iOS 26+) | Foundation Models |\n| Structured output from on-device LLM | Foundation Models (`@Generable`) |\n| Image classification, object detection | Core ML |\n| Custom model from PyTorch/TensorFlow | Core ML + coremltools |\n| Running specific open-source LLMs | MLX Swift or llama.cpp |\n| Maximum throughput on Apple Silicon | MLX Swift |\n| Cross-platform LLM inference | llama.cpp |\n| OCR and text recognition | Vision framework |\n| Sentiment analysis, NER, tokenization | Natural Language framework |\n| Training custom classifiers on device | Create ML |\n\n## Apple Foundation Models Overview\n\nOn-device language model optimized for Apple Silicon. 
Available on devices\nsupporting Apple Intelligence (iOS 26+, macOS 26+).\n\n- Token budget covers input + output; check `contextSize` for the limit\n- Check `supportedLanguages` for supported locales\n- Guardrails always enforced, cannot be disabled\n\n### Availability Checking (Required)\n\nAlways check before using. Never crash on unavailability.\n\n```swift\nimport FoundationModels\n\nswitch SystemLanguageModel.default.availability {\ncase .available:\n    // Proceed with model usage\ncase .unavailable(.appleIntelligenceNotEnabled):\n    // Guide user to enable Apple Intelligence in Settings\ncase .unavailable(.modelNotReady):\n    // Model is downloading; show loading state\ncase .unavailable(.deviceNotEligible):\n    // Device cannot run Apple Intelligence; use fallback\ndefault:\n    // Graceful fallback for any other reason\n}\n```\n\n### Session Management\n\n```swift\n// Basic session\nlet session = LanguageModelSession()\n\n// Session with instructions\nlet session = LanguageModelSession {\n    \"You are a helpful cooking assistant.\"\n}\n\n// Session with tools\nlet session = LanguageModelSession(\n    tools: [weatherTool, recipeTool]\n) {\n    \"You are a helpful assistant with access to tools.\"\n}\n```\n\nKey rules:\n- Sessions are stateful -- multi-turn conversations maintain context automatically\n- One request at a time per session (check `session.isResponding`)\n- Call `session.prewarm()` before user interaction for faster first response\n- Save/restore transcripts: `LanguageModelSession(model: model, tools: [], transcript: savedTranscript)`\n\n### Structured Output with @Generable\n\nThe `@Generable` macro creates compile-time schemas for type-safe output:\n\n```swift\n@Generable\nstruct Recipe {\n    @Guide(description: \"The recipe name\")\n    var name: String\n\n    @Guide(description: \"Cooking steps\", .count(3))\n    var steps: [String]\n\n    @Guide(description: \"Prep time in minutes\", .range(1...120))\n    var prepTime: Int\n}\n\nlet 
response = try await session.respond(\n    to: \"Suggest a quick pasta recipe\",\n    generating: Recipe.self\n)\nprint(response.content.name)\n```\n\n#### @Guide Constraints\n\n| Constraint | Purpose |\n|---|---|\n| `description:` | Natural language hint for generation |\n| `.anyOf([values])` | Restrict to enumerated string values |\n| `.count(n)` | Fixed array length |\n| `.range(min...max)` | Numeric range |\n| `.minimum(n)` / `.maximum(n)` | One-sided numeric bound |\n| `.minimumCount(n)` / `.maximumCount(n)` | Array length bounds |\n| `.constant(value)` | Always returns this value |\n| `.pattern(regex)` | String format enforcement |\n| `.element(guide)` | Guide applied to each array element |\n\nProperties generate in declaration order. Place foundational data before\ndependent data for better results.\n\n### Streaming Structured Output\n\n```swift\nlet stream = session.streamResponse(\n    to: \"Suggest a recipe\",\n    generating: Recipe.self\n)\nfor try await snapshot in stream {\n    // snapshot.content is Recipe.PartiallyGenerated (all properties optional)\n    if let name = snapshot.content.name { updateNameLabel(name) }\n}\n```\n\n### Tool Calling\n\n```swift\nstruct WeatherTool: Tool {\n    let name = \"weather\"\n    let description = \"Get current weather for a city.\"\n\n    @Generable\n    struct Arguments {\n        @Guide(description: \"The city name\")\n        var city: String\n    }\n\n    func call(arguments: Arguments) async throws -> String {\n        let weather = try await fetchWeather(arguments.city)\n        return weather.description\n    }\n}\n```\n\nRegister tools at session creation. 
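The `Tool` protocol above pairs a `@Generable` argument schema with a `call(arguments:)` implementation; at runtime the session resolves the model's tool choice by name, validates the generated arguments against the schema, runs the tool, and feeds the string result back into generation. As a framework-agnostic illustration of that dispatch cycle, here is a minimal Python sketch -- the names (`dispatch`, `TOOLS`, `weather_tool`) are hypothetical, not the FoundationModels API:

```python
# Conceptual sketch of the tool-dispatch cycle a session performs.
# All identifiers here are illustrative, not the FoundationModels API.

def weather_tool(args: dict) -> str:
    """Stub standing in for WeatherTool.call(arguments:)."""
    return f"Sunny in {args['city']}"

# Registry built at "session creation": tool name -> handler + argument schema.
TOOLS = {"weather": {"handler": weather_tool, "required_args": {"city"}}}

def dispatch(tool_name: str, args: dict) -> str:
    spec = TOOLS[tool_name]                        # model-chosen tool
    missing = spec["required_args"] - args.keys()  # schema check (@Generable analogue)
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return spec["handler"](args)                   # string result returned to the model

print(dispatch("weather", {"city": "Cupertino"}))  # Sunny in Cupertino
```

In FoundationModels the schema validation and dispatch are done for you; the sketch only shows why the `Arguments` type must be `@Generable` -- it is the contract the runtime checks before your `call` ever runs.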
The model invokes them autonomously.\n\n### Error Handling\n\n```swift\ndo {\n    let response = try await session.respond(to: prompt)\n} catch let error as LanguageModelSession.GenerationError {\n    switch error {\n    case .guardrailViolation(let context):\n        // Content triggered safety filters\n    case .exceededContextWindowSize(let context):\n        // Too many tokens; summarize and retry\n    case .concurrentRequests(let context):\n        // Another request is in progress on this session\n    case .unsupportedLanguageOrLocale(let context):\n        // Current locale not supported\n    case .unsupportedGuide(let context):\n        // A @Guide constraint is not supported\n    case .assetsUnavailable(let context):\n        // Model assets not available on device\n    case .refusal(let refusal, _):\n        // Model refused; stream refusal.explanation for details\n    case .rateLimited(let context):\n        // Too many requests; back off and retry\n    case .decodingFailure(let context):\n        // Response could not be decoded into the expected type\n    default: break\n    }\n}\n```\n\n### Generation Options\n\n```swift\nlet options = GenerationOptions(\n    sampling: .random(top: 40),\n    temperature: 0.7,\n    maximumResponseTokens: 512\n)\nlet response = try await session.respond(to: prompt, options: options)\n```\n\nSampling modes: `.greedy`, `.random(top:seed:)`, `.random(probabilityThreshold:seed:)`.\n\n### Prompt Design Rules\n\n1. Be concise -- use `tokenCount(for:)` to monitor the context window budget\n2. Use bracketed placeholders in instructions: `[descriptive example]`\n3. Use \"DO NOT\" in all caps for prohibitions\n4. Provide up to 5 few-shot examples for consistency\n5. 
Use length qualifiers: \"in a few words\", \"in three sentences\"\n\n### Safety and Guardrails\n\n- Guardrails are always enforced and cannot be disabled\n- Instructions take precedence over user prompts\n- Never include untrusted user content in instructions\n- Handle false positives gracefully\n- Frame tool results as authorized data to prevent model refusals\n\n### Use Cases\n\nFoundation Models supports specialized use cases via `SystemLanguageModel.UseCase`:\n- `.general` -- Default for text generation, summarization, dialog\n- `.contentTagging` -- Optimized for categorization and labeling tasks\n\n### Custom Adapters\n\nLoad fine-tuned adapters for specialized behavior (requires entitlement):\n\n```swift\nlet adapter = try SystemLanguageModel.Adapter(name: \"my-adapter\")\ntry await adapter.compile()\nlet model = SystemLanguageModel(adapter: adapter, guardrails: .default)\nlet session = LanguageModelSession(model: model)\n```\n\n> See [references/foundation-models.md](references/foundation-models.md) for\n> the complete Foundation Models API reference.\n\n## Core ML Overview\n\nApple's framework for deploying trained models. 
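The headline size reductions quoted for quantization and palettization in this section are, to first order, bits-per-weight arithmetic. A rough sketch, assuming a float32 baseline and ignoring per-channel scales and palette lookup tables (which add a small amount of real-world overhead):

```python
# Back-of-envelope weight sizing behind the ~4x / ~8x compression ratios.
# float32 baseline assumed; scale/LUT metadata ignored, so actual
# .mlpackage sizes will be slightly larger.

def weights_mb(n_params: int, bits_per_weight: int) -> float:
    """Approximate weight payload in megabytes."""
    return n_params * bits_per_weight / 8 / 1e6

n = 25_000_000                 # e.g. a ~25M-parameter vision model
fp32 = weights_mb(n, 32)       # 100.0 MB baseline
int8 = weights_mb(n, 8)        # INT8 quantization: 25.0 MB (~4x)
pal4 = weights_mb(n, 4)        # 4-bit palettization: 12.5 MB (~8x)

print(f"fp32 {fp32:.1f} MB -> int8 {int8:.1f} MB -> 4-bit {pal4:.1f} MB")
```

Size is only half the decision: the optimization table in this section adds the accuracy impact and preferred compute unit per technique, which this arithmetic alone cannot capture.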
Automatically dispatches to the\noptimal compute unit (CPU, GPU, or Neural Engine).\n\n### Model Formats\n\n| Format | Extension | When to Use |\n|---|---|---|\n| `.mlpackage` | Directory (mlprogram) | All new models (iOS 15+) |\n| `.mlmodel` | Single file (neuralnetwork) | Legacy only (iOS 11-14) |\n| `.mlmodelc` | Compiled | Pre-compiled for faster loading |\n\nAlways use mlprogram (`.mlpackage`) for new work.\n\n### Conversion Pipeline (coremltools)\n\n```python\nimport coremltools as ct\nimport torch\n\n# PyTorch conversion (torch.jit.trace)\n# model: your trained torch.nn.Module; example_input: a sample input tensor\nmodel.eval()  # CRITICAL: always call eval() before tracing\ntraced = torch.jit.trace(model, example_input)\nmlmodel = ct.convert(\n    traced,\n    inputs=[ct.TensorType(shape=(1, 3, 224, 224), name=\"image\")],\n    minimum_deployment_target=ct.target.iOS18,\n    convert_to='mlprogram',\n)\nmlmodel.save(\"Model.mlpackage\")\n```\n\n### Optimization Techniques\n\n| Technique | Size Reduction | Accuracy Impact | Best Compute Unit |\n|---|---|---|---|\n| INT8 per-channel | ~4x | Low | CPU/GPU |\n| INT4 per-block | ~8x | Medium | GPU |\n| Palettization 4-bit | ~8x | Low-Medium | Neural Engine |\n| W8A8 (weights+activations) | ~4x | Low | ANE (A17 Pro/M4+) |\n| Pruning 75% | ~4x | Medium | CPU/ANE |\n\n### Swift Integration\n\n```swift\nlet config = MLModelConfiguration()\nconfig.computeUnits = .all\nlet model = try MLModel(contentsOf: modelURL, configuration: config)\n\n// Async prediction (iOS 17+)\nlet output = try await model.prediction(from: input)\n```\n\n### MLTensor (iOS 18+)\n\nSwift type for multidimensional array operations:\n\n```swift\nimport CoreML\n\nlet tensor = MLTensor([1.0, 2.0, 3.0, 4.0])\nlet reshaped = tensor.reshaped(to: [2, 2])\nlet result = tensor.softmax()\n```\n\n> See [references/coreml-conversion.md](references/coreml-conversion.md) for the\n> full conversion pipeline and [references/coreml-optimization.md](references/coreml-optimization.md)\n> for optimization techniques.\n\n## MLX Swift Overview\n\nApple's ML 
framework for Swift. Highest sustained generation throughput on\nApple Silicon via unified memory architecture.\n\n### Loading and Running LLMs\n\n```swift\nimport MLX\nimport MLXLLM\n\nlet config = ModelConfiguration(id: \"mlx-community/Mistral-7B-Instruct-v0.3-4bit\")\nlet model = try await LLMModelFactory.shared.loadContainer(configuration: config)\n\ntry await model.perform { context in\n    let input = try await context.processor.prepare(\n        input: UserInput(prompt: \"Hello\")\n    )\n    let stream = try generate(\n        input: input,\n        parameters: GenerateParameters(temperature: 0.0),\n        context: context\n    )\n    for await part in stream {\n        print(part.chunk ?? \"\", terminator: \"\")\n    }\n}\n```\n\n### Model Selection by Device\n\n| Device | RAM | Recommended Model | RAM Usage |\n|---|---|---|---|\n| iPhone 12-14 | 4-6 GB | SmolLM2-135M or Qwen 2.5 0.5B | ~0.3 GB |\n| iPhone 15 Pro+ | 8 GB | Gemma 3n E4B 4-bit | ~3.5 GB |\n| Mac 8 GB | 8 GB | Llama 3.2 3B 4-bit | ~3 GB |\n| Mac 16 GB+ | 16 GB+ | Mistral 7B 4-bit | ~6 GB |\n\n### Memory Management\n\n1. Never exceed 60% of total RAM on iOS\n2. Set GPU cache limits: `MLX.GPU.set(cacheLimit: 512 * 1024 * 1024)`\n3. Unload models on app backgrounding\n4. Use \"Increased Memory Limit\" entitlement for larger models\n5. 
Physical device required (no simulator support for Metal GPU)\n\n> See [references/mlx-swift.md](references/mlx-swift.md) for full MLX Swift\n> patterns and llama.cpp integration.\n\n## Multi-Backend Architecture\n\nWhen an app needs multiple AI backends (e.g., Foundation Models + MLX fallback):\n\n```swift\nfunc respond(to prompt: String) async throws -> String {\n    if SystemLanguageModel.default.isAvailable {\n        return try await foundationModelsRespond(prompt)\n    } else if canLoadMLXModel() {\n        return try await mlxRespond(prompt)\n    } else {\n        throw AIError.noBackendAvailable\n    }\n}\n```\n\nSerialize all model access through a coordinator actor to prevent contention:\n\n```swift\nactor ModelCoordinator {\n    func withExclusiveAccess<T>(_ work: () async throws -> T) async rethrows -> T {\n        try await work()\n    }\n}\n```\n\n## Performance Best Practices\n\n1. Run outside debugger for accurate benchmarks (Xcode: Cmd-Opt-R, uncheck\n   \"Debug Executable\")\n2. Call `session.prewarm()` for Foundation Models before user interaction\n3. Pre-compile Core ML models to `.mlmodelc` for faster loading\n4. Use EnumeratedShapes over RangeDim for Neural Engine optimization\n5. Use 4-bit palettization for best Neural Engine memory/latency gains\n6. Batch Vision framework requests in a single `perform()` call\n7. Use async prediction (iOS 17+) in Swift concurrency contexts\n8. Neural Engine (Core ML) is most energy-efficient for compatible operations\n\n## Common Mistakes\n\n1. **No availability check.** Calling `LanguageModelSession()` without checking\n   `SystemLanguageModel.default.availability` crashes on unsupported devices.\n2. **No fallback UI.** Users on pre-iOS 26 or devices without Apple Intelligence\n   see nothing. Always provide a graceful degradation path.\n3. **Exceeding the context window.** The token budget covers input + output.\n   Monitor usage via `tokenCount(for:)` and summarize when needed.\n4. 
**Concurrent requests on one session.** `LanguageModelSession` supports one\n   request at a time. Check `session.isResponding` or serialize access.\n5. **Untrusted content in instructions.** User input placed in the instructions\n   parameter bypasses guardrail boundaries. Keep user content in the prompt.\n6. **Forgetting `model.eval()` before Core ML tracing.** PyTorch models must be\n   in eval mode before `torch.jit.trace`. Training-mode artifacts corrupt output.\n7. **Using neuralnetwork format.** Always use `mlprogram` (.mlpackage) for new\n   Core ML models. The legacy neuralnetwork format is deprecated.\n8. **Exceeding 60% RAM on iOS (MLX Swift).** Large models cause OOM kills.\n9. **Running MLX in simulator.** MLX requires Metal GPU -- use physical devices.\n10. **Not unloading models on background.** Unload in `scenePhase == .background`.\n\n## Review Checklist\n\n- [ ] Framework selection matches use case and target OS version\n- [ ] Foundation Models: availability checked before every API call\n- [ ] Foundation Models: graceful fallback when model unavailable\n- [ ] Foundation Models: session prewarm called before user interaction\n- [ ] Foundation Models: @Generable properties in logical generation order\n- [ ] Foundation Models: token budget accounted for (check `contextSize`)\n- [ ] Core ML: model format is mlprogram (.mlpackage) for iOS 15+\n- [ ] Core ML: model.eval() called before tracing/exporting PyTorch models\n- [ ] Core ML: minimum_deployment_target set explicitly\n- [ ] Core ML: model accuracy validated after compression\n- [ ] MLX Swift: model size appropriate for target device RAM\n- [ ] MLX Swift: GPU cache limits set, models unloaded on backgrounding\n- [ ] All model access serialized through coordinator actor\n- [ ] Concurrency: model types and tool implementations are `Sendable`-conformant or `@MainActor`-isolated\n- [ ] Physical device testing performed (not simulator)\n\n## References\n\n- [Foundation Models 
API](references/foundation-models.md) -- LanguageModelSession, @Generable, tool calling, prompt design\n- [Core ML Conversion](references/coreml-conversion.md) -- Model conversion from PyTorch, TensorFlow, other frameworks\n- [Core ML Optimization](references/coreml-optimization.md) -- Quantization, palettization, pruning, performance tuning\n- [MLX Swift & llama.cpp](references/mlx-swift.md) -- MLX Swift patterns, llama.cpp integration, memory management","tags":["apple","device","swift","ios","skills","dpearson2699"],"capabilities":["skill","source-dpearson2699","category-swift-ios-skills"],"categories":["swift-ios-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/dpearson2699/swift-ios-skills/apple-on-device-ai","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"install_from":"skills.sh"}},"qualityScore":"0.300","qualityRationale":"deterministic score 0.30 from registry signals: · indexed on skills.sh · published under dpearson2699/swift-ios-skills","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill:v1","enrichmentVersion":1,"enrichedAt":"2026-04-22T05:40:38.114Z","embedding":null,"createdAt":"2026-04-18T20:33:39.611Z","updatedAt":"2026-04-22T05:40:38.114Z","lastSeenAt":"2026-04-22T05:40:38.114Z","tsv":"
","prices":[{"id":"252f9f00-7b4c-4a86-bd9c-cf75baac0d86","listingId":"c92760ee-e31f-4996-8157-664c60fa0a58","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"dpearson2699","category":"swift-ios-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T20:33:39.611Z"}],"sources":[{"listingId":"c92760ee-e31f-4996-8157-664c60fa0a58","source":"github","sourceId":"dpearson2699/swift-ios-skills/apple-on-device-ai","sourceUrl":"https://github.com/dpearson2699/swift-ios-skills/tree/main/skills/apple-on-device-ai","isPrimary":false,"firstSeenAt":"2026-04-18T22:00:43.464Z","lastSeenAt":"2026-04-22T00:53:41.398Z"},{"listingId":"c92760ee-e31f-4996-8157-664c60fa0a58","source":"skills_sh","sourceId":"dpearson2699/swift-ios-skills/apple-on-device-ai","sourceUrl":"https://skills.sh/dpearson2699/swift-ios-skills/apple-on-device-ai","isPrimary":true,"firstSeenAt":"2026-04-18T20:33:39.611Z","lastSeenAt":"2026-04-22T05:40:38.114Z"}],"details":{"listingId":"c92760ee-e31f-4996-8157-664c60fa0a58","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"dpearson2699","slug":"apple-on-device-ai","source":"skills_sh","category":"swift-ios-skills","skills_sh_url":"https://skills.sh/dpearson2699/swift-ios-skills/apple-on-device-ai"},"updatedAt":"2026-04-22T05:40:38.114Z"}}