Hello Beck lets you run open-source Large Language Models (LLMs) locally on your machine. You can download, manage, and chat with a variety of models securely, with no internet connection needed for the actual processing.
When you first launch Hello Beck, the application asks whether you would like to download a "recommended LLM": a good general-purpose model, well suited to your hardware, to get started with.
A progress bar will appear under the "Available Models" list showing the download status and file size.


Type your question or prompt in the text box at the bottom and press the send button. The model will stream the text back to you.
When using reasoning models (also known as "thinking models"), you will often see output labeled Chain of Thought before the model replies to your query. This Chain of Thought (CoT) output shows the model's internal "working out", which it generates first to help it produce a better final answer. For example, if you ask it to solve a logic puzzle, the CoT section will show the model testing different hypotheses before the final answer appears below.
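Under the hood, many open-source reasoning models emit their chain of thought between marker tags in the raw output stream, which a chat application can use to separate the "working out" from the final answer. The sketch below assumes the commonly used `<think>…</think>` markers; the exact markers vary by model, and this is an illustration rather than Hello Beck's actual implementation:

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split raw model output into (chain_of_thought, final_answer).

    The <think>...</think> markers are an assumption: many open
    reasoning models use them, but the exact tags vary by model.
    """
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1:
        # No reasoning section found; treat everything as the answer.
        return "", text
    cot = text[start + len(open_tag):end].strip()
    answer = text[end + len(close_tag):].strip()
    return cot, answer
```

For example, `split_reasoning("<think>try A, fails; try B</think>B works.")` returns the hypothesis-testing text separately from the final reply.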

At the top of the chat window, you will see real-time statistics: the number of tokens, the time taken, and the rate of processing for each of the following 3 stages.
Note that while the processing speed of the CoT and Out tokens should be the same to within a margin of error, separating them is still useful for comparing how many tokens or seconds are spent on CoT relative to the response output.
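As a rough illustration of how these per-stage figures relate to each other (the numbers and the helper function here are invented for illustration, not Hello Beck's internals), the rate shown for a stage is simply its token count divided by the time it took:

```python
def stage_stats(token_count, seconds):
    """Return (tokens, seconds, tokens-per-second) for one stage.

    Illustrative only: the rate is just tokens divided by elapsed time,
    which is how a tokens/sec figure is typically derived.
    """
    rate = token_count / seconds if seconds > 0 else 0.0
    return token_count, seconds, round(rate, 1)

# Made-up example: 120 CoT tokens over 4 seconds -> 30.0 tokens/sec.
print(stage_stats(120, 4.0))
```

So if the CoT stage used 120 tokens over 4 seconds and the output stage 60 tokens over 2 seconds, both ran at the same rate, but twice as much time and twice as many tokens went into the reasoning.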