The Llama.cpp C++-to-Csharp wrapper is a minor extension of the Llama.cpp tag b3490 codebase, lightly modified by testedlines so that it can be compiled for, and called from, the Styled Lines Csharp Unity Asset Store package.
License
The Llama.cpp C++-to-Csharp wrapper from testedlines.com is distributed under the Standard Unity Asset Store EULA (see: https://unity.com/legal/as-terms).
Building
This document provides step-by-step instructions to build and deploy the "Llama.cpp C++-to-Csharp wrapper from testedlines.com" across various platforms, including Windows, Android, MacOS, iOS, and WebGL. Each section details the necessary build commands and subsequent actions required to properly configure and deploy the build artifacts.
Pre-build Steps
1. Generate C++ Interface Wrapper for CSharp
Before proceeding with platform-specific builds, we need to generate a C++ interface wrapper that will allow C# integration. To do this, navigate to the cpp-sources/wrapper/infrence-lib directory and execute the following Docker command:
docker run -it -v "$(pwd)/../../:/llama" --entrypoint /bin/bash olejak/swig-monoaot -c 'cd /llama/wrapper/infrence-lib && /usr/local/bin/swig -c++ -csharp -monoaotcompat -namespace LlamaLibrary -outdir ../dotnet/LlamaCppInfrence -o LlamaWrapper.cxx -I"../../" -I"../../common/" -I"./" lib.i'
This command generates the necessary C++ wrapper and outputs the files to the dotnet/LlamaCppInfrence directory. These files will be used later in the Unity3D C# wrapper build process. You can build that Docker image using the Dockerfile available in the Docker-swig folder.
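If the image is not already present locally, it can be built from that folder. The sketch below assumes the olejak/swig-monoaot tag used in the command above and that Docker-swig sits in the repository root; it skips gracefully when Docker or the folder is unavailable:

```shell
# Build the SWIG + Mono AOT image from the Docker-swig folder (run from the
# repository root). The tag must match the one used in the swig command above.
if command -v docker >/dev/null 2>&1 && [ -d Docker-swig ]; then
  docker build -t olejak/swig-monoaot ./Docker-swig || echo "image build failed"
  built=attempted
else
  built=skipped
  echo "docker or Docker-swig folder not available; skipping image build"
fi
```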
Windows-x64 Build
Please use PowerShell when calling Docker commands on Windows, since it resolves the current directory via "$(pwd)" in a manner compatible with MacOS.
1. Building for Windows-x64
To build the project for Windows-x64, use the following Docker command from PowerShell:
docker run -it -v "$(pwd)/../../:/llama" dockcross/windows-static-x64 /bin/bash -c "cd /llama && rm -rf ./build-win ; mkdir build-win && cd build-win && cmake -DBUILD_SHARED_LIBS=OFF .. && make -j llama_lib"
Expected Artifacts:
libllama_lib.dll: This file will be generated in the build-win/bin directory.
- Action: Copy libllama_lib.dll into the following locations: wrapper/dotnet/LlamaCppDemo or Assets/StyledLines/Plugins/Windows.
Android Build
1. ARM64 (arm64-v8a)
To build for ARM64 (arm64-v8a), execute:
docker run -it -v "$(pwd)/../../:/llama" dockcross/android-arm64 /bin/bash -c "cd /llama && rm -rf ./build-arm64-v8a ; mkdir build-arm64-v8a && cd build-arm64-v8a && cmake -DBUILD_SHARED_LIBS=OFF -DANDROID_PLATFORM=android-22 -DCMAKE_C_FLAGS='-march=armv8.4a+dotprod -fPIC -flto -fopenmp -static-openmp -static-libgcc' -DCMAKE_CXX_FLAGS='-fPIC -flto -fopenmp -static-openmp -static-libstdc++ -static-libgcc' -DANDROID_ABI=arm64-v8a -DCMAKE_BUILD_TYPE=Release .. && make -j llama_lib"
2. ARMv7 (armeabi-v7a)
For ARMv7 (armeabi-v7a) builds, use:
docker run -it -v "$(pwd)/../../:/llama" dockcross/android-arm /bin/bash -c "cd /llama && rm -rf ./build-armeabi-v7a ; mkdir build-armeabi-v7a && cd build-armeabi-v7a && cmake -DBUILD_SHARED_LIBS=OFF -DANDROID_PLATFORM=android-22 -DCMAKE_C_FLAGS='-march=armv7-a -m32 -fPIC -flto -fopenmp -static-openmp -static-libgcc' -DCMAKE_CXX_FLAGS='-m32 -fPIC -flto -fopenmp -static-openmp -static-libstdc++ -static-libgcc' -DANDROID_ABI=armeabi-v7a -DCMAKE_BUILD_TYPE=Release .. && make -j llama_lib"
3. x86
To build for Android x86:
docker run -it -v "$(pwd)/../../:/llama" dockcross/android-x86 /bin/bash -c "cd /llama && rm -rf ./build-x86 ; mkdir build-x86 && cd build-x86 && cmake -DBUILD_SHARED_LIBS=OFF -DANDROID_PLATFORM=android-22 -DCMAKE_C_FLAGS='-march=i686 -m32 -fPIC -flto -fopenmp -static-openmp -static-libgcc' -DCMAKE_CXX_FLAGS='-m32 -fPIC -flto -fopenmp -static-openmp -static-libstdc++ -static-libgcc' -DANDROID_ABI=x86 -DCMAKE_BUILD_TYPE=Release .. && make -j llama_lib"
4. x86_64
For x86_64 builds:
docker run -it -v "$(pwd)/../../:/llama" dockcross/android-x86_64 /bin/bash -c "cd /llama && rm -rf ./build-x86_64 ; mkdir build-x86_64 && cd build-x86_64 && cmake -DBUILD_SHARED_LIBS=OFF -DANDROID_PLATFORM=android-22 -DCMAKE_C_FLAGS='-march=x86-64 -fPIC -flto -fopenmp -static-openmp -static-libgcc' -DCMAKE_CXX_FLAGS='-fPIC -flto -fopenmp -static-openmp -static-libstdc++ -static-libgcc' -DANDROID_ABI=x86_64 -DCMAKE_BUILD_TYPE=Release .. && make -j llama_lib"
Expected Artifacts for Android Builds:
For each build configuration, libllama_lib.a will be generated in the corresponding build-<ABI> directory.
Action: Copy the artifacts from the wrapper/infrence-lib build directory of each build into your Unity project for the respective Android platform, e.g. Assets/StyledLines/Plugins/Android/libs.
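The per-ABI copy step can be scripted. The loop below is a sketch run from the repository root; UNITY_PROJECT is a placeholder for your Unity project path, and the per-ABI libs/<ABI> subfolder layout is an assumption about how your Unity Android plugin folders are organized:

```shell
# Copy each ABI's static library into a matching Unity plugin subfolder.
# UNITY_PROJECT is a placeholder; point it at your real project root.
UNITY_PROJECT=${UNITY_PROJECT:-$(mktemp -d)}   # temp dir so the sketch is runnable
for abi in arm64-v8a armeabi-v7a x86 x86_64; do
  src="build-$abi/wrapper/infrence-lib/libllama_lib.a"
  dst="$UNITY_PROJECT/Assets/StyledLines/Plugins/Android/libs/$abi"
  mkdir -p "$dst"
  if [ -f "$src" ]; then
    cp "$src" "$dst/"
  else
    echo "missing $src (build that ABI first)"
  fi
done
```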
MacOS and iOS Build
Prerequisites
Before building for iOS or MacOS, ensure you have Xcode and CMake installed:
brew install cmake
xcode-select --install
1. Building for iOS
To build for iOS, follow these steps:
mkdir build-ios
cd build-ios
git clone --recursive https://github.com/leetal/ios-cmake
cmake -G Xcode -DCMAKE_TOOLCHAIN_FILE=$(pwd)/ios-cmake/ios.toolchain.cmake -DGGML_METAL=OFF -DPLATFORM=OS64COMBINED -DLLAMA_BUILD_EXAMPLES=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_SERVER=OFF -DBUILD_LLAMA_WRAPPER_DEMO=OFF -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF ..
xcodebuild -target llama_lib -configuration Release
2. Building for MacOS
For MacOS, execute:
mkdir build-mac
cd build-mac
git clone --recursive https://github.com/leetal/ios-cmake
cmake -G Xcode -DCMAKE_TOOLCHAIN_FILE=$(pwd)/ios-cmake/ios.toolchain.cmake -DGGML_METAL=OFF -DPLATFORM=MAC_UNIVERSAL -DLLAMA_BUILD_EXAMPLES=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_SERVER=OFF -DBUILD_LLAMA_WRAPPER_DEMO=OFF -DCMAKE_BUILD_TYPE=Release ..
xcodebuild -target llama_lib -configuration Release
Expected Artifacts:
llama_lib.bundle: After building, you should have the bundle library file llama_lib.bundle (which may appear as a directory on Windows) in the build directory; it can be integrated into iOS or MacOS applications by copying it into Assets/StyledLines/Plugins/Mac.
WebGL Build
To build for WebGL, execute the following Docker command:
docker run -it -v "$(pwd)/../../:/llama" emscripten/emsdk:3.1.8 /bin/bash -c "cd /llama && rm -rf ./build-web ; mkdir build-web && cd build-web && emcmake cmake -DBUILD_SHARED_LIBS=OFF -DEMSCRIPTEN=ON .. && export EMCC_CFLAGS='-pthread -O3 -msimd128 -fno-rtti -DNDEBUG -flto=full -fpic -s BUILD_AS_WORKER=0 -s EXPORT_ALL=1 -s EXPORT_ES6=1 -s MODULARIZE=1 -s INITIAL_MEMORY=128MB -s MAXIMUM_MEMORY=2GB -s ALLOW_MEMORY_GROWTH=1 -s FORCE_FILESYSTEM=1 -s NO_EXIT_RUNTIME=1' && emmake make -j llama_lib"
Expected Artifacts:
From the build-web folder, copy the following files:
common/libcommon.a
ggml/src/libggml.a
src/libllama.a
wrapper/infrence-lib/libllama_lib.a
Action: Place these files into the Assets/StyledLines/Plugins/WebGL/ directory in your Unity project.
Note: The WebGL build process may take approximately 25 minutes in optimized mode. For the Unity project build, setting Code Optimization to Runtime Speed with LTO is recommended.
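Gathering the four static libraries can also be scripted. The sketch below is meant to be run from the build-web folder; UNITY_PROJECT is a placeholder for your Unity project path:

```shell
# Copy the four WebGL static libraries into the Unity plugin folder.
# UNITY_PROJECT is a placeholder; point it at your real project root.
UNITY_PROJECT=${UNITY_PROJECT:-$(mktemp -d)}   # temp dir so the sketch is runnable
dst="$UNITY_PROJECT/Assets/StyledLines/Plugins/WebGL"
mkdir -p "$dst"
for lib in common/libcommon.a ggml/src/libggml.a src/libllama.a \
           wrapper/infrence-lib/libllama_lib.a; do
  if [ -f "$lib" ]; then
    cp "$lib" "$dst/"
  else
    echo "missing $lib (run the WebGL build first)"
  fi
done
```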
Building Llama.cpp C++-to-Csharp wrapper for Unity3D
After the Llama.cpp builds are complete, we proceed to build the C# wrapper for Unity3D. You have two options:
Option A: Automated Build
Run the following Docker command:
docker run --rm -v "$(pwd)/../:/wrapper" -v "$(pwd)/../autofix_csharp_on_build.sh:/automate.sh" ubuntu:20.04 /bin/bash /automate.sh
Option B: Manual Modifications
- In the file containing the class libllama_libPINVOKE, replace "libllama_lib" with LIBRARY_NAME.
- Add the following code after the line class libllama_libPINVOKE {:
#if UNITY_STANDALONE_OSX || UNITY_EDITOR_OSX
public const string LIBRARY_NAME = "llama_lib.bundle";
#elif UNITY_IOS && !UNITY_EDITOR
public const string LIBRARY_NAME = "__Internal";
#elif UNITY_ANDROID && !UNITY_EDITOR
public const string LIBRARY_NAME = "libllama_lib.so";
#elif UNITY_WEBGL && !UNITY_EDITOR
public const string LIBRARY_NAME = "__Internal";
#else
public const string LIBRARY_NAME = "libllama_lib";
#endif
- In the file containing public class LlamaInfrence : global::System.IDisposable {, set the constructor to public LlamaInfrence(global::System.IntPtr cPtr, bool cMemoryOwn).
- In the file containing public class LoggingContext : global::System.IDisposable {, make the global::System.Runtime.InteropServices.HandleRef swigCPtr; field public.
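The first replacement above can also be applied non-interactively with sed. The snippet below only sketches that one edit on a stand-in file (the real autofix_csharp_on_build.sh may perform more changes, and the actual generated file name depends on the SWIG output):

```shell
# Demonstrate the "libllama_lib" -> LIBRARY_NAME substitution on a stand-in
# file; on the real source you would run sed against the generated .cs file.
f=$(mktemp)
printf 'class libllama_libPINVOKE {\n  [DllImport("libllama_lib")]\n}\n' > "$f"
# Replace every quoted "libllama_lib" literal with the LIBRARY_NAME constant.
sed 's/"libllama_lib"/LIBRARY_NAME/g' "$f" > "$f.patched"
grep -q 'DllImport(LIBRARY_NAME)' "$f.patched" && echo "substitution applied"
```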
Final Steps
- Copy the files from wrapper/dotnet/LlamaCppInfrence into Assets/StyledLines/Runtime/LlamaInfrence.
- Copy CallbackWrapper.cs from wrapper/dotnet/LlamaCppDemo into Assets/StyledLines/Runtime/LlamaInfrence.
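The two final copies can be sketched as follows, run from the repository root; UNITY_PROJECT is a placeholder for your Unity project path, and the source paths are taken from the steps above:

```shell
# Copy the generated C# wrapper sources plus the demo callback helper into
# the Unity runtime folder. UNITY_PROJECT is a placeholder; adjust as needed.
UNITY_PROJECT=${UNITY_PROJECT:-$(mktemp -d)}   # temp dir so the sketch is runnable
dst="$UNITY_PROJECT/Assets/StyledLines/Runtime/LlamaInfrence"
mkdir -p "$dst"
cp wrapper/dotnet/LlamaCppInfrence/*.cs "$dst/" 2>/dev/null \
  || echo "generated wrapper sources not found; run the SWIG step first"
cp wrapper/dotnet/LlamaCppDemo/CallbackWrapper.cs "$dst/" 2>/dev/null \
  || echo "CallbackWrapper.cs not found"
```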