Archive | May, 2024

Easily Upgrade Your Windows 365 Cloud PC Licenses with Step-up Licensing

3 May

Are you a Windows 365 Enterprise admin looking to upgrade your users to a higher configuration license without the full cost of two separate licenses? Step-up licensing makes this process simple and cost-effective.

What are Step-up Licenses?

Step-up licenses allow admins with a direct Enterprise Agreement to migrate users from a lower-configuration Windows 365 license to a higher-configuration one. These are available for compute (RAM/CPU) and storage upgrades.

The great thing about step-up licenses is you only pay the difference in cost between the lower and higher tier licenses, rather than the full price of two standalone licenses. This can mean significant savings when you need to upgrade multiple users.

How to Resize Cloud PCs with Step-up Licenses

Let’s walk through an example. Say you purchased step-up licenses to upgrade from Windows 365 Enterprise 2vCPU/4GB/128GB to 4vCPU/16GB/128GB. Here’s how to bulk resize the Cloud PCs to the new higher spec while preserving user data:

  1. In the Microsoft Admin Center, go to the “Your Products” page. You’ll see your new stepped-up 4vCPU licenses added and an equal number of 2vCPU licenses removed.
  2. Follow the bulk resize process, selecting the 2vCPU as the base license and 4vCPU as the target license. This will migrate the users and their data to the higher spec Cloud PCs.
  1. Important: You have 90 days to complete the migration before users lose access to their old 2vCPU Cloud PCs. So don’t wait too long!

Key Takeaways

  • Step-up licenses make it easy and affordable to upgrade to higher configs
  • You can bulk resize to migrate users while keeping their data intact
  • You have 90 days to complete the switch before the old licenses expire

Reference Microsoft LinkResize a Cloud PC | Microsoft Learn

Hopefully this helps clarify how to take advantage of step-up licensing to give your Windows 365 users an upgraded experience. Please let me know if I’ve missed any steps or details, and I’ll be happy to update the post.

Thanks,
Aresh Sarkari

Exploring Uncensored LLM Model – Dolphin 2.9 on Llama-3-8b

2 May

I’ve been diving deep into the world of Large Language Models (LLMs) like ChatGPT, Gemini, Claude, and LLAMA. But recently, I stumbled upon something that completely blew my mind: uncensored LLMs! 🤯

As someone who loves pushing the boundaries of AI and exploring new frontiers, I couldn’t resist the temptation to try out an uncensored LLM for myself. And let me tell you, the experience was nothing short of mind-blowing! 🎆 After setting up and running an uncensored LLM locally for the first time, I was amazed by the raw, unfiltered outputs it generated. It gave me a whole new perspective on the potential of such LLMs and why having an uncensored variant is so important for certain perspectives and society in general.

In this blog post, I’ll be sharing my journey with uncensored LLMs, diving into the nitty-gritty details of what they are, how they differ from regular LLMs, and why they exist. I’ll also be sharing my hands-on experience with setting up and running an uncensored LLM locally, so you can try it out for yourself! 💻

🤖 Introduction: Uncensored LLM vs Regular LLM

Large Language Models (LLMs) are AI systems trained on vast amounts of text data to understand and generate human-like text based on input prompts. There are two main types of LLMs: regular and uncensored.

Regular LLMs, such as those created by major organizations like OpenAI, Anthropic, Google, etc. are designed with specific safety and ethical guidelines, often reflecting societal norms and legal standards. These models avoid generating harmful or inappropriate content. (Click on each link to read their AI Principles)

Uncensored LLMs, on the other hand, are models that do not have these built-in restrictions. They are designed to generate outputs based on the input without ethical filtering, which can be useful for certain applications but also pose risks.

📊 Table of Comparison

FeatureRegular LLMUncensored LLM
Content FilteringYes (aligned to avoid harmful content)No (generates responses as is)
Use CasesGeneral purpose, safer for public useSpecialized tasks needing raw output
Cultural AlignmentOften aligned with Western normsNo specific alignment
Risk of Harmful OutputLowerHigher
FlexibilityRestricted by ethical guidelinesHigher flexibility in responses

🐬 What is the Dolphin 2.9 Latest Model?

🐬Dolphin 2.9 is a project by Eric Hartford @ Cognitive Computations aimed at creating an open-source, uncensored, and commercially licensed dataset and series of instruct-tuned language models. This initiative is based on Microsoft’s Orca paper and seeks to provide a foundation for building customized models without the typical content restrictions found in conventional LLMs. The model uses a dataset that removes biases, alignment, or any form of censorship, aiming to create a purely instructional tool that can be layered with user-specific alignments.

🐬 The Dolphin 2.9 Dataset

Following are the details of the dataset used to train the Dolphin Model: (Note the base model is Llama-3-8b)

Dataset DetailsLinks
cognitivecomputations/dolphin – This dataset is an attempt to replicate the results of Microsoft’s Orca
cognitivecomputations/dolphin · Datasets at Hugging Face
HuggingFaceH4/ultrachat_200k – HuggingFaceH4/ultrachat_200k · Datasets at Hugging Face
teknium/OpenHermes-2.5 – This is the dataset that made OpenHermes 2.5 and Nous Hermes 2 series of models.teknium/OpenHermes-2.5 · Datasets at Hugging Face
microsoft/orca-math-word-problems-200k – This dataset contains ~200K grade school math word problems.microsoft/orca-math-word-problems-200k · Datasets at Hugging Face

💻 How to Run the Model Locally Using LMStudio

To run Dolphin or any similar uncensored model locally, you typically need to follow these steps, assuming you are using a system like LMStudio for managing your AI models:

  • Setup Your Environment:
    • Install LMStudio software its available on (Windows, Mac & Linux). This software allows you to manage and deploy local LLMs without you having to setup Python, Machine Learning, Transformers etc. libraries.
    • Link to Download the Windows Bits – LM Studio – Discover, download, and run local LLMs
    • My laptop config has 11th Gen Intel processor, 64 GB RAM & Nvdia RTX 3080 8 GB VRAM, 3 TB Storage.
  • Download the Model and Dependencies:
    • The best space to keep a track on models is Hugging Face – Models – Hugging Face. You can keep a track of the model releases and updates here.
    • Copy the model name from Hugging Face – cognitivecomputations/dolphin-2.9-llama3-8b
    • Paste this name in LM Studio and it will list out all the quantized models
    • In my case due to the configurations I selected 8Bit model. Please note lower the quantized version less accurate the model is.
    • Download of the model will take time depending upon your internet connection.
  • Prepare the Model for Running:
    • Within LMStudio click on the Chat interface to configure model settings.
    • Select the model from the drop down list – dolphin 2.9 llama3
    • You will be able to run it stock but I like to configure the Advanced Configurations
    • Based on your system set the GPU to 50/50 or max. I have setup for max
    • Click Relod model to apply configuration
  • Run the Model:
    • Use LMStudio to load and run the model.
    • Within the User Prompt enter what you want to ask the Dolphin model
    • Monitor the model’s performance and adjust settings as needed.
  • Testing and Usage:
    • Once the model is running, you can begin to input prompts and receive outputs.
    • Test the model with various inputs to ensure it functions as expected and adjust configurations as needed.
    • Note below was a test fun prompt across ChatGPT, Claude & Dolphin. You can clearly see the winner being Dolphin 🤗 
  • Eject and Closing the model:
    • Once you done with the session you can select Eject Model
    • This will release the VRAM/RAM and CPU utlization back to normal

💻 Quantized & GGUF Model

As home systems usually wont have the necessary GPU to run LLM models natively on consumer grade hardware. A quantized model is a compressed version of a neural network where the weights and activations are represented with lower-precision data types, such as int8 or uint8, instead of the typical float32. This reduces the model’s size and computational requirements while maintaining acceptable performance.

GGUF stands for “General-Purpose, GPU-Free”. It refers to a type of large language model that is designed to be versatile and capable of performing a wide range of natural language processing tasks without requiring expensive GPU hardware for inference.

The Dolphin 2.9 GGUF models are:

Model NameQuantizationModel SizeCPUGPUVRAMRAM
dolphin-2.9-llama3-8b-q3_K_M.gguf3-bit (q3)4.02 GBCompatible with most CPUsNot required for inferenceNot required for inference~4.02 GB
dolphin-2.9-llama3-8b-q4_K_M.gguf4-bit (q4)4.92 GBCompatible with most CPUsNot required for inferenceNot required for inference~4.92 GB
dolphin-2.9-llama3-8b-q5_K_M.gguf5-bit (q5)5.73 GBCompatible with most CPUsNot required for inferenceNot required for inference~5.73 GB
dolphin-2.9-llama3-8b-q6_K.gguf6-bit (q6)6.6 GBCompatible with most CPUsNot required for inferenceNot required for inference~6.6 GB
dolphin-2.9-llama3-8b-q8_0.gguf8-bit (q8)8.54 GBCompatible with most CPUsNot required for inferenceNot required for inference~8.54 GB

Reference Links

Following are the list of helpful links:

DescriptionLink
Details and background about the Dolphin ModelDolphin 🐬 (erichartford.com)
What are uncensored models?Uncensored Models (erichartford.com)
Various Dolphin Models on various base LLMscognitivecomputations (Cognitive Computations) (huggingface.co)
Dolphin Llama 3 8B GGUF model I used on LMStudio cognitivecomputations/dolphin-2.9-llama3-8b-gguf · Hugging Face
LM StudioLM Studio – Discover, download, and run local LLMs
Model Memory Estimator UtilityModel Memory Utility – a Hugging Face Space by hf-accelerate

By following these steps, you can deploy and utilize an uncensored LLM like Dolphin 2.9 for research, development, or any specialized application where conventional content restrictions are not desirable. I hope you’ll find this insightful on your joruney of LLMs. Please let me know if I’ve missed any steps or details, and I’ll be happy to update the post.

Thanks,
Aresh Sarkari