Just a stranger trying things.

  • 4 Posts
  • 92 Comments
Joined 1 year ago
cake
Cake day: July 16th, 2023

help-circle


  • I didn’t say it can’t. But I’m not sure how well it is optimized for it. From my initial testing it queues queries and submits them one after another to the model, I have not seen it batch compute the queries, but maybe it’s a setup thing on my side. vLLM on the other hand is designed specifically for the multi co current user use case and has multiple optimizations for it.


  • The Hobbyist@lemmy.ziptoSelfhosted@lemmy.worldSelf-hosting LLMs
    link
    fedilink
    English
    arrow-up
    22
    ·
    edit-2
    4 days ago

    I run the Mistral-Nemo(12B) and Mistral-Small (22B) on my GPU and they are pretty code. As others have said, the GPU memory is one of the most limiting factors. 8B models are decent, 15-25B models are good and 70B+ models are excellent (solely based on my own experience). Go for q4_K models, as they will run many times faster than higher quantization with little performance degradation. They typically come in S (Small), M (Medium) and (Large) and take the largest which fits in your GPU memory. If you go below q4, you may see more severe and noticeable performance degradation.

    If you need to serve only one user at the time, ollama +Webui works great. If you need multiple users at the same time, check out vLLM.

    Edit: I’m simplifying it very much, but hopefully should it is simple and actionable as a starting point. I’ve also seen great stuff from Gemma2-27B

    Edit2: added links

    Edit3: a decent GPU regarding bang for buck IMO is the RTX 3060 with 12GB. It may be available on the used market for a decent price and offers a good amount of VRAM and GPU performance for the cost. I would like to propose AMD GPUs as they offer much more GPU mem for their price but they are not all as supported with ROCm and I’m not sure about the compatibility for these tools, so perhaps others can chime in.

    Edit4: you can also use openwebui with vscode with the continue.dev extension such that you can have a copilot type LLM in your editor.










  • I understand your position. There is a learning curve to containers, but I can assure you that getting your basics on the topic will open a whole new world of possibilities and also make everything much easier for yourself. The vast majority of people run containers which make the services less brittle because they have their own tailored environment and don’t depend on the host libraries and packages and also brings increased security because the services can’t easily escape their boundaries rendering their potential vulnerabilities less of an issue compared to running those same services bare metal.

    I started on synology too. There is a website called Marius hosting which focuses on tutorials for containers on synology, but his instructions have been updated the last few years to focus on spinning up containers manually rather than through the UI, which makes it more intimidating than it needs to be for beginners… I’ll link it here just as a reference. I’ll see if on the way back machine he shows the easier way and report back if I find something.

    Edit: yes here is an original tutorial for Jellyfin (this method still works for me and is still how I use docker lately): https://web.archive.org/web/20210305002024/https://mariushosting.com/how-to-install-jellyfin-on-your-synology-nas/


  • To answer your question more specifically, most people set up the pi with docker, using services which have a front end accessible in the browser. They basically use their browser to navigate to the front end of the service they want to use and administer it like that. For instance portainer to manage their docker containers, or pihole for managing their firewall, or even jellyfin for their media which is both the website to consume the media and has an administrator dashboard.

    Edit: this is in complement to using something like tailscale which basically allows you to access these services away from home. They work in conjunction.




  • From what I understand, bsky’s architecture seems to allow federation at multiple levels. On one side the individual profiles are actually websites and the app aggregates the content almost as an RSS reader. I do see some profiles which are independent like Jeff Gierling’s, so yes federation at the profile level seems to work.

    And this is really important because it is one way to prevent your data from being hostage by the service. Then there is another level of federation. I’m not entirely sure of the terminology here, but there is one aggregator aspect, which is quite compute intensive. And that one I don’t know if there is another instance of it. But functionally speaking, I’m quite impressed by the technical aspect of bsky. There has been a lot of thought put into it.

    And monetizing it is not the issue, the problem is mostly how. That they have some paid features is fine, it’s even important that there are ways to monetize it without milking their users of their privacy.

    Let’s hope this works out and becomes sustainable while respecting the users!


  • I’m very grateful for your extended help. I’ve made some progress. I’m able to get the prompt to appear asking me for my passphrase to unlock the right partition (sda3 in my case). Entering the passphrase, however, drops me in the Dracut emergency shell after ~3min of dracut logs, seemingly looping. (Edit: the reason for why it drops me in the shell is very unclear. It says Dropping to debug shell. /bin/sh: can't access tty: job control turned off. And if I try to exit the dracut shell, it says dracut Warning: could not boot.).

    In the Dracut emergency shell, checking /dev/mapper/ I see a luks-<sda3-uuid> listed. Running blkid I see it listed too with TYPE=crypto_LUKS. I also see a dev/dm-0 with a dedicated UUID, in ext4. I ran blkid which shows:

    /dev/mapper/luks-705fc477-573a-4ef6-81b6-a14c43cda1f5: UUID="57955343-922a-4918-9bc1-797ca8d13a9c" TYPE="ext4"
    /dev/sda1: UUID="cc5e0b03-3544-4bef-ab8b-8b72dd236926" TYPE="ext4"
    /dev/sda2: UUID="4df1af6c-3199-4bb2-bb12-bcf897cfc6fc" TYPE="swap"
    /dev/sda3: UUID="705fc477-573a-4ef6-81b6-a14c43cda1f5" TYPE="crypto_LUKS"
    /dev/dm-0: UUID="57955343-922a-4918-9bc1-797ca8d13a9c" TYPE="ext4"
    

    I checked the status of the filesystem running cryptsetup status /dev/mapper/luks-<sda3-uuid> and it says it is active, which I guess means it is unlocked?

    I checked the /root directory, and it is empty. So I tried to mount the partition myself: mount /dev/mapper/luks-<sda3-uuid> /root but it fails saying mount: mounting /dev/mapper/luks-<sda3-uuid> on /root failed: No such file or directory and that got me really puzzled? I’ve been searching far and wide but I can’t seem to find anyone with a similar situation. I feel like I’m close to getting this working.

    Below is my syslinux kernel config, and the 2nd and 3rd items are what I booted into (/boot/extlinux.conf)

    # Generated by update-extlinux 6.04_pre1-r15
    DEFAULT menu.c32
    PROMPT 0
    MENU TITLE Alpine/Linux Boot Menu
    MENU HIDDEN
    MENU AUTOBOOT Alpine will be booted automatically in # seconds.
    TIMEOUT 10
    LABEL lts
      MENU DEFAULT
      MENU LABEL Linux lts
      LINUX vmlinuz-lts
      INITRD initramfs-lts
      APPEND root=/dev/mapper/root modules=sd-mod,usb-storage,ext4 cryptroot=UUID=705fc477-573a-4ef6-81b6-a14c43cda1f5 cryptdm=root rootfstype=ext4 rd.debug log_buf_len=1M rd.shell
    
    LABEL lts
      MENU DEFAULT
      MENU LABEL Dracut Linux lts
      LINUX vmlinuz-lts
      INITRD /boot/initramfs-6.6.56-0-lts.img
      APPEND root=/dev/mapper/luks-705fc477-573a-4ef6-81b6-a14c43cda1f5 modules=sd-mod,usb-storage,ext4 rootfstype=ext4 rd.shell rd.debug log_buf_len=1M rd.luks.uuid=705fc477-573a-4ef6-81b6-a14c43cda1f5
    
    LABEL lts
      MENU DEFAULT
      MENU LABEL Dracut Linux lts 2
      LINUX vmlinuz-lts
      INITRD /boot/initramfs-6.6.56-0-lts.img
      APPEND modules=sd-mod,usb-storage,ext4,dm,crypt,rootfs-block rootfstype=ext4 rootflags=rw,relatime rd.shell rd.debug log_buf_len=1M root=UUID=57955343-922a-4918-9bc1-797ca8d13a9c rd.luks.uuid=705fc477-573a-4ef6-81b6-a14c43cda1f5
    

    And here the /proc/cmdline of the booted partition:

    BOOT_IMAGE=vmlinuz-lts modules=sd-mod,usb-storage,ext4,dm,crypt,rootfs-block rootfstype=ext4 rootflags=rw,relatime rd.shell rd.debug log_buf_len=1M root=UUID=57955343-922a-4918-9bc1-797ca8d13a9c rd.luks.uuid=705fc477-573a-4ef6-81b6-a14c43cda1f5 initrd=/boot/initramfs-6.6.56-0-lts.img
    

    Here is my setup, when I boot in my regular initramfs (the one I’m trying to replicate using dracut):

    mytestalpine:~# lsblk -o NAME,FSTYPE,FSVER,LABEL,UUID,FSAVAIL,FSUSE%,MOUNTPOINTS
    NAME     FSTYPE      FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
    sda                                                                                  
    ├─sda1   ext4                    cc5e0b03-3544-4bef-ab8b-8b72dd236926  195.5M    21% /boot
    ├─sda2   swap                    4df1af6c-3199-4bb2-bb12-bcf897cfc6fc                [SWAP]
    └─sda3   crypto_LUKS             705fc477-573a-4ef6-81b6-a14c43cda1f5                
      └─root ext4                    57955343-922a-4918-9bc1-797ca8d13a9c    2.3G     8% /
    
    mytestalpine:~# lsblk -l -n /dev/sda3
    sda3   8:3   0  2.8G 0 part  
    root 253:0   0  2.8G 0 crypt /
    

    Note: No idea of the relevance, but I’m testing this setup in a VM, with a BIOS firmware.


  • I’m following bsky’s progress. They have shared things which on a technical standpoint and from a social network empowerment perspective are very interesting. The portability of the profiles and the fully custom moderation layers are particularly noteworthy and seem to go far beyond what I’ve seen in other social networks. Even in mastodon apparently it is not possible to port a profile from one instance to another without losing all your post history (ente.io tried this recently and got caught by this). And for moderation, you have to rely on your instance moderation rather than personalized one. And the annotation part of bsky is also interesting to me.