2024-09-24 08:00:00
I just had the pleasure (cough) to connect an MSSQL database to a Laravel application at work. Because the process was super tedious, I wanted to quickly jot this down so I will never have to go through this again.
We're building a Laravel application with DDEV. DDEV essentially moves all development tools into Docker containers and adds some nice features like local database management.
Laravel comes with the boilerplate to use MSSQL out of the box. In your app, just set the database config to use `sqlsrv`:

```php
'connections' => [
    'sqlsrv' => [
        'driver' => 'sqlsrv',
        'url' => env('DB_URL'),
        'host' => env('DB_HOST', '127.0.0.1'),
        'port' => env('DB_PORT', '1433'),
        'database' => env('DB_DATABASE', 'laravel'),
        'username' => env('DB_USERNAME', 'root'),
        'password' => env('DB_PASSWORD', ''),
        'unix_socket' => env('DB_SOCKET', ''),
        'charset' => env('DB_CHARSET', 'utf8'),
        'prefix' => '',
        'prefix_indexes' => true,
        // 'encrypt' => env('DB_ENCRYPT', 'yes'),
        // 'trust_server_certificate' => env('DB_TRUST_SERVER_CERTIFICATE', 'false'),
    ],
],
```
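For reference, the matching `.env` entries might look something like this. Host, credentials and database name are placeholders for your own setup:

```ini
DB_CONNECTION=sqlsrv
DB_HOST=mssql.example.com
DB_PORT=1433
DB_DATABASE=laravel
DB_USERNAME=sa
DB_PASSWORD=secret
```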
You will see errors when starting your app, because the corresponding drivers need to be installed first. Instead of adding them through Composer (a widely adopted package manager for PHP), you have to install the ODBC drivers through the system package manager, because Microsoft doesn't maintain a PHP package. On top of that, you also have to add Microsoft's own package repository, because they don't publish packages in the major Linux distributions' repositories either. In our setup with DDEV, this has to be done by amending the Dockerfile used for the application container. Create a file at .ddev/web-build/Dockerfile
and add the following contents:
```dockerfile
ARG BASEIMAGE
FROM $BASEIMAGE

RUN npm install --global forever
RUN echo "Built on $(date)" > /build-date.txt

# Add Microsoft's package repository and signing key
RUN curl -fsSL https://packages.microsoft.com/keys/microsoft.asc | sudo gpg --dearmor -o /usr/share/keyrings/microsoft-prod.gpg
RUN curl https://packages.microsoft.com/config/debian/12/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list

# Install the ODBC driver and build the PHP extensions via PECL
RUN apt-get update
RUN apt-get --allow-downgrades -y install libssl-dev
RUN apt-get -y update && yes | ACCEPT_EULA=Y apt-get -y install php8.3-dev php-pear unixodbc-dev htop
RUN ACCEPT_EULA=Y apt-get -y install msodbcsql18 mssql-tools18
RUN sudo pecl channel-update pecl.php.net
RUN sudo pecl install sqlsrv
RUN sudo pecl install pdo_sqlsrv

# Enable the extensions for the CLI, FPM and Apache SAPIs
RUN sudo printf "; priority=20\nextension=sqlsrv.so\n" > /etc/php/8.3/mods-available/sqlsrv.ini
RUN sudo printf "; priority=30\nextension=pdo_sqlsrv.so\n" > /etc/php/8.3/mods-available/pdo_sqlsrv.ini
RUN sudo phpenmod -v 8.3 -s cli sqlsrv pdo_sqlsrv
RUN sudo phpenmod -v 8.3 -s fpm sqlsrv pdo_sqlsrv
RUN sudo phpenmod -v 8.3 -s apache2 sqlsrv pdo_sqlsrv
RUN echo 'export PATH="$PATH:/opt/mssql-tools18/bin"' >> ~/.bash_profile
```
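Once the Dockerfile is in place, rebuild the container and verify that PHP actually picked up the extensions. This is just the sanity check I'd run, assuming PHP 8.3 as configured above:

```bash
ddev restart
ddev exec php -m | grep -i sqlsrv
# should list both "sqlsrv" and "pdo_sqlsrv"
```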
If you're reading this in the future and Microsoft has released a new version of the ODBC drivers, you may have to follow the updated installation instructions from their documentation. It took me a while to realize that I couldn't install version 17 of the driver because I was following the installation instructions for version 18. The two are apparently incompatible with each other.
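If you're unsure which driver version actually ended up in the container, listing the registered ODBC drivers should tell you. With the setup above, there should be an entry for driver 18 (assuming the odbcinst tool was pulled in alongside unixODBC):

```bash
ddev exec odbcinst -q -d
# [ODBC Driver 18 for SQL Server]
```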
I hope that you'll never have to touch the shithole that is MSSQL, but if you do, I hope that this guide will be of value to you.
2024-09-01 08:00:00
The term "AI Slop" is currently on the rise. It describes all the AI generated images and texts we see on the internet. I'd like to propose a term that basically describes reverse AI Slop: Mental AI Fog.
Instead of consuming too much AI-generated content (which also applies), AI Fog describes the inability to produce content without the help of AI. We're so surrounded by flowery, well-written articles and resources that we think it's not worth the effort to write a text ourselves. This is comparable to how computer keyboards, spellchecking and autocorrection have rendered my generation and the ones to come incapable of comfortably writing longform text.
I'm currently suffering badly from AI fog. I'm so used to letting some LLM flesh out a quick and dirty prompt that it's hard for me to write this text, get the point across and not feel insecure about my style of writing. This site is supposed to be a way to persist my thoughts whenever I want to, but are they still my thoughts if they have been proofread and corrected by a computer?
As a result, all these thoughts are piling up in my head. Where I previously braindumped thoughts on a piece of paper, I now only come up with a couple of words and let the AI elaborate. I'm consuming what are supposed to be my own thoughts, which perpetuates the cycle.
This needs to stop. I need to get back to creating things myself. I've decided to abandon the use of LLMs for most content on this site. And where AI has been used, it will be clearly mentioned. I'm contemplating adding some sort of "Backed by AI" label for certain pages to make it harder for myself to fall back on a helping hand. I will likely still be using LLMs, but making it obvious will force me to mindfully choose where they are used.
Is this something you can relate to? Is AI fog even a fitting term for this? I don't know. And if it isn't, that's okay because I came up with it myself.
2024-08-31 08:00:00
I just rewrote parts of my Positive Hacker News RSS Feed project to use an ML model to filter negative news out of the Hacker News timeline. This is far more reliable than the previous approach, a rule-based sentiment analyzer from NLTK.
I'm using the model cardiffnlp/twitter-roberta-base-sentiment-latest, which was trained on a huge amount of tweets. It's really tiny (~500 MB) and easily runs inside the existing GitHub Actions workflows. You can try out the model yourself on the HuggingFace model card.
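I'm not showing the project's actual workflow here, but running the model locally boils down to a few lines with the transformers pipeline. The titles and the "drop anything classified as negative" rule below are just an illustration of the idea:

```python
from transformers import pipeline

# Load the sentiment model once; it gets cached after the first download
classifier = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)

titles = [
    "Show HN: I built a tiny static site generator",
    "Massive data breach exposes millions of passwords",
]

for title in titles:
    result = classifier(title)[0]  # e.g. {"label": "negative", "score": 0.89}
    if result["label"] != "negative":
        print(title)  # keep everything that isn't classified as negative
```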
<img width="522" alt="grafik" src="https://github.com/user-attachments/assets/06f42df6-624a-4108-ada8-d0d37a53e693">
If you want to subscribe to more positive tech news, simply replace the Hacker News feed in your RSS reader with this one (or add it if you haven't already): https://garritfra.github.io/positive_hackernews/feed.xml
2024-08-03 08:00:00
Embedding models have long been a daunting concept for me. But what are they? And why are they so useful? Let's break it down in simple terms.
An embedding is basically a numerical representation of a piece of information - it could be text, audio, an image, or even a video. Think of it as a way to capture the essence or meaning of that information in a list of numbers.
For example, let's say we have this text: "show me a list of ground transportation at boston airport". An embedding model might turn that into something like this:
```
[0.03793335, -0.008010864, -0.002319336, -0.0110321045, -0.019882202, -0.023864746, 0.011428833, -0.030349731, -0.044830322, 0.028289795, -0.02810669, -0.0032749176, -0.04208374, -0.0077705383, -0.0033798218, -0.06335449, ... ]
```
At first, this looks like a jumble of numbers. But each of these numbers points to a specific area within the embedding model's "space", where similar words or concepts might be located.
To help wrap our heads around this, let's look at a visualization. This beautiful image shows the entirety of the nomic-embed-text-v1.5 embedding model, as generated by this visualization tool:
Now, if we take our example text about Boston airport transportation and plot its embeddings on this map, we'd see that some clusters are lit up, especially around "transportation". This means that the model has figured out that the topic of the query must be related to transportation in some way.
Zooming into the image, we can see that more specific topics around transportation, like "Airport", "Travel" or "Highways", are lit up, which more closely matches our query.
In a nutshell, embedding models are able to group terms by topics that are related to each other.
Encoding meaning in text has tons of different use cases. One that I'm particularly excited about is building RAG applications. RAG stands for Retrieval-Augmented Generation and refers to a technique for Large Language Models (LLMs) where, given a question, you enrich it with relevant bits of information before having the model answer it.
Here's how embeddings are useful for RAG:

- All your documents (or chunks of them) are run through the embedding model once, and the resulting vectors are stored.
- When a question comes in, it is embedded with the same model.
- The question's vector is compared against the stored vectors (for example with cosine similarity), and the closest chunks are retrieved.
- Those chunks are handed to the LLM together with the original question, so it can answer based on your own data.
This method is way better than previously used techniques like just searching for exact words in the documents. It's like the difference between having a librarian who only looks at book titles, and one who actually understands what the books are about.
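To make the retrieval step a bit more concrete, here is a minimal sketch using the sentence-transformers library. The model name and the documents are arbitrary examples, not what any particular system (including the visualized nomic model) uses:

```python
from sentence_transformers import SentenceTransformer, util

# Any text embedding model works here; this one is small and widely used
model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "The Silver Line bus connects Boston Logan Airport with South Station.",
    "Our cafeteria serves pancakes every Friday.",
    "Taxis and ride-share pickups are located outside Terminal B.",
]
query = "show me a list of ground transportation at boston airport"

# Turn the documents and the query into embedding vectors
doc_embeddings = model.encode(documents, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Cosine similarity: higher means "closer in meaning"
scores = util.cos_sim(query_embedding, doc_embeddings)[0].tolist()

# Print the documents ranked by how well they match the query
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.2f}  {doc}")
```

In a real RAG setup, the top-ranked chunks would then be pasted into the prompt alongside the question.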
Beyond RAG applications, embeddings are super useful for all sorts of things: semantic search, recommendation systems, clustering similar content, or spotting duplicates.
Embeddings are a clever way to turn words (or images, or sounds) into numbers that computers can understand and compare. By doing this, we can make emerging AI technologies a whole lot smarter at understanding language and finding connections between ideas.
Next time you're chatting with an AI or getting scarily accurate recommendations online, you can nod knowingly and think, "Ah yes, embeddings at work!"
2024-07-02 08:00:00
The more I'm getting into large language models (LLMs), the more I'm fascinated by what you can do with them. To "digest" my reading list of cool articles and projects regarding LLMs, I assembled the following list. If you're also interested but haven't started your journey down this neverending rabbit hole, these may contain some good pointers:
2024-06-27 08:00:00
Just a quick note to my future self on how to test an SMTP connection with nothing but a tiny busybox container.
In my case specifically, I tested the connection from inside a Kubernetes cluster. Here's the quickest way to get a temporary pod up and running:
```bash
kubectl run -n backend -i --tty --rm debug --image=busybox --restart=Never
```
Busybox comes with telnet installed, which we can use to establish a connection to the server:
```
/ # telnet smtp.mydomain.com 25
Connected to smtp.mydomain.com
220 mail.mydomain.com ESMTP Postfix (SMTP)
```
Next, we can issue the SMTP commands through the open TCP connection to send a test mail. Lines beginning with a status code are server responses:
```
HELO smtp.mydomain.com
250 smtp.mydomain.com
MAIL FROM:[email protected]
250 2.1.0 Ok
RCPT TO:[email protected]
250 2.1.5 Ok
DATA
354 End data with <CR><LF>.<CR><LF>
From: [noreply] [email protected]
To: [Receiver] [email protected]
Date: Thu, 27 Jun 2024 10:08:26 -0200
Subject: Test Message
This is a test message.
.
250 2.0.0 Ok: queued as 2478B7F135
```
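Finally, you can end the session cleanly with QUIT. A standard Postfix server answers with a 221 status; the exact wording may differ on other servers:

```
QUIT
221 2.0.0 Bye
```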
In case there's a firewall issue, you might not be able to establish a connection in the first place, or you won't get a reply to your commands. In our case, everything worked fine.
I hope this is useful!