
Saturday, February 1, 2025

Solo Development: Learning To Let Go Of Perfection

 The best and worst thing about solo development is the “solo” part. There’s a lot of freedom in working alone, and that freedom can be inspiring, but it can also become a debilitating hindrance to productivity and progress. Victor Ayomipo shares his personal lessons on what it takes to navigate solo development and build the “right” app.

As expected from anyone who has ever tried building anything solo, my goal was not to build an app but the app — the one app that’s so good you wonder how you ever survived without it. I had everything in place: wireframes, a to-do list, project structure — you name it. Then I started building. Just not the product. I started with the landing page for it, which took me four days, and I hadn’t even touched the app’s core features yet. The idea itself was so good I had to start marketing it right away!

I found myself making every detail perfect: every color, shadow, gradient, font size, margin, and padding had to be spot on. I don’t even want to say how long the logo took.

Spoiler:
No one cares about your logo.

Why did I get so stuck on something that was never even part of the core app I wanted so badly to build? Why wasn’t I nagging myself to move on when I clearly needed to?

The reality of solo development is that there is no one to tell you when to stop or simply say, “Yo, this is good enough! Move on.” Most users don’t care whether a login button is yellow or green. What they want (and need) is a button that works and solves their problem when they click it.

Test Early And Often

Unnecessary tweaks, indecisive UI decisions, and perfectionism are the core reasons I spend more time on things than necessary.

Like most solo developers, I also started with the hope of pushing out builds with the efficiency of a large-scale team. But it is easier said than done.

When building solo, you start coding, then you maybe notice a design flaw, and you switch to fixing it, then a bug appears, and you try fixing that, and voilà — the day is gone. There comes a time when it hits you: “You know what? It’s time to build messy.” That’s when good intentions of project and product management go out the window, and that’s when I find myself working by the seat of my pants rather than plowing forward with defined goals and actionable tasks based on good UI/UX principles, like storyboards, user personas, and basic prioritization.

This realization is something you have to experience to grasp fully. The trick I’ve learned is to focus on getting something out there for people to see and then work on actual feedback. In other words,

It’s more important to get the idea out there and iterate on it than to reach for perfection right out of the gate.

Because guess what? Even if you have the greatest app idea in the world, you’re never going to make it perfect until you start receiving feedback on it. You’re no mind reader — as much as we all want to be one — and some insights (often the most relevant) can only be received through real user feedback and analytics. Sure, your early assumptions may be correct, but how do you know until you ship them and start evaluating them?

Nowadays, I like to tell others (and myself) to work from hypotheses instead of absolutes. Make an assertion, describe how you intend to test it, and then ship it. With that, you can gather relevant insights that you can use to get closer to perfection — whatever that is.

Strength In Recognizing Weakness

Let’s be real: Building a full application on your own is not an easy feat. I’d say it’s like trying to build a house by yourself; it seems doable, but the reality is that it takes far more hands than the two you have to make it happen. And not only to make it happen, but to make it happen well.

There’s only so much one person can do, and acknowledging your strengths and weaknesses up-front will serve you well by helping you avoid the trap of believing you can do it all alone.

I once attempted to build a project management app alone. I knew it might be difficult, but I was confident. Within a few days, this “simple” project grew legs and expanded with new features like team collaboration, analytics, time tracking, and custom reports, many of which I was super excited to make.

Building a full app takes a lot of time. Think about it; you’re doing the work of a team all alone without any help. There’s no one to provide you with design assets, content, or back-end development. No stakeholder to “swoop and poop” on your ideas (which might be a good thing). Every decision, every line of code, and every design element is 100% on you alone.

It is technically possible to build a full-featured app solo, but when you think about it, there’s a reason why the concept of an MVP exists. Take Instagram, for example; it wasn’t launched with reels, stories, creator insights, and so on. It started with one simple thing: photo sharing.

All I’m trying to say is start small, launch, and let users guide the evolution of the product. And if you can recruit more hands to help, that would be even better. Just remember to leverage your strengths and reinforce your weaknesses by leaning on other people’s strengths.

Yes, Think Like an MVP

The concept of a minimum viable product (MVP) has always been fascinating to me. In its simplest form, it means building the basic version of your idea that technically works and getting it in front of users. Yes, this is such a straightforward and widely distributed tip, but it’s still one of the hardest principles for solo developers to follow, particularly for me.

I mentioned earlier that my “genius” app idea grew legs. And lots of them. I had more ideas than I knew what to do with, and I hadn’t even written a reasonable amount of code! Sure, this app could be enhanced to support face ID, dark mode, advanced security, real-time results, and a bunch of other features. But all these could take months of development for an app that you’re not even certain users want.

I’ve learned to ask myself: “What would this project look like if it were easy to build?” It’s almost surreal how the answer nearly always aligns with what users want. If you can distill your grand idea into a single indispensable one that does one or two things extremely well, I think you’ll find — as I have — that the final result is laser-focused on solving real user problems.

Ship the simplest version first. Dark mode can wait. All you need is a well-defined idea, a hypothesis to test, and a functional prototype to validate that hypothesis; anything else is probably noise.

Handle Imperfection Gracefully

You may have heard about the “Ship it Fast” approach to development and instantly recognize the parallels between it and what I’ve discussed so far. In a sense, “Ship it Fast” is ultimately another way of describing an MVP: get the idea out fast and iterate on it just as quickly.

Some might disagree with the ship-fast approach and consider it reckless and unprofessional, which is understandable because, as developers, we care deeply about the quality of our work. However,

The ship-fast mentality is not to ignore quality but to push something out ASAP and learn from real user experiences. Ship it now — perfect it later.

That’s why I like to tell other developers that shipping an MVP is the safest, most professional way to approach development. It forces you to stay in scope and on task without succumbing to your whimsies. I even go so far as to make myself swear an “Oath of Focus” at the start of every project.

I, Vayo, hereby solemnly swear (with one hand on this design blueprint) to make no changes, no additions, and no extra features until this app is fully built in all its MVP glory. I pledge to avoid the temptations of endless tweaking and the thoughts of “just one more feature.”

Only when a completed prototype is achieved will I consider any new features, enhancements, or tweaks.

Signed,
Vayo, Keeper of the MVP

Remember, there’s no one there to hold you accountable when you develop on your own. Taking a brief moment to pause and accept that my first version won’t be flawless helps put me in the right headspace early in the project.

Prioritize What Matters

I have noticed that no matter what I build, there are always going to be bugs. Always. If Google still ships bugs in the Google Notes app, then trust me, it’s fine for a solo developer to accept that bugs will always be part of any project.

Look at flaky tests. You could run a test over 1,000 times and get all greens, and then the next day, you run the same test and an error shows up. It’s just the nature of software development. The same goes for endlessly adding features: it never ends either. There’s always going to be a new feature that you’re excited about. The challenge is to curb some of that enthusiasm and shelve it responsibly for a later time when it makes sense to work on it.

I’ve learned to categorize bugs and features into two types: intrusive and non-intrusive. Intrusive issues are those that prevent the project from functioning properly until they’re fixed, like crashes and serious errors. Non-intrusive issues are the silent ones. Sure, they should be fixed, but the product will work just fine and won’t prevent users from getting value if they aren’t addressed right away.

You may want to categorize your bugs and features in other ways, and I’ve seen plenty of other examples, including:

  • High value, low value;
  • High effort, low effort;
  • High cost, low cost;
  • Need to have, nice to have.

I’ve even seen developers and teams use these categorizations to create a fancy priority “score” that weighs each category. Whatever helps you stay focused and on task is the right approach for you; the specific categories you use matter far less.

Live With Your Stack

Here’s a classic conundrum in development circles:

Should I use React? Or Next.js? Or wait, how about Vue? I heard it’s more optimized. But hold on, I read that React Redux is dead and that Zustand is the new hot tool.

And just like that, you’ve spent an entire day thinking about nothing but the tech stack you’re using to build the darn thing.

We all know that the average user couldn’t care less about the tech stack under the hood. Go ahead and ask your mom what tech stack WhatsApp is built on, and let me know what she says. Most times, it’s just us who obsess over tech stacks, and that usually only happens when we’re asked to check under the hood.

I have come to accept that there will always be new tech stacks released every single day with the promise of 50% better performance and 10% less code. That new tool might scale better, but do I actually have a scaling problem with my current number of zero users? Probably not.

My advice:

Pick the tools you work with best and stick to those tools until they start working against you.

There’s no use fighting a new tool early on if something you already know and use gets the job done. Basically, don’t prematurely optimize or constantly chase the latest shiny object.

Do Design Before The First Line of Code

I know lots of solo developers out there suck at design, and I’m probably among the top 50. My design process has traditionally been to open VS Code, create a new project, and start building the idea in whatever way comes to mind. No design assets, comps, or wireframes to work with — just pure, unstructured improvisation. That’s not a good idea, and it’s a habit I’m actively trying to break.

These days, I make sure to have a blueprint of what I’m building before I start writing code. Once I have that, I make sure to follow through and not change anything to respect my “Oath of Focus.”

I like how many teams call comps and wireframes “project artifacts.” They are pieces of evidence that provide a source of truth for how something looks and works. You might be the sort of person who works better with sets of requirements, and that’s totally fine. But having some sort of documentation you can point back to in your work is like having turn-by-turn navigation on a long road trip — it’s indispensable for getting where you need to go.

And what if you’re like me and don’t pride yourself on being the best designer? That’s another opportunity to admit your weaknesses up-front and recruit help from someone with those strengths. That way, you can articulate the goal and focus on what you’re good at.

Give Yourself Timelines

Personally, without deadlines, I’m almost unstoppable at procrastinating. I’ve started setting time limits when building any project, as it helps with procrastination and makes sure something is pushed out at a specified time. Although this won’t work without accountability, I feel the two work hand in hand.

I set a 2–3 week deadline to build a project. And no matter what, as soon as that time is up, I must post or share the work in its current state on my socials. This pushes me out of my comfort zone: I don’t want to share a half-baked project with the public, so I’m conditioned to work faster and get it all done. It’s interesting to see how far you can go when you trick your brain like this.

I realize that this is an extreme constraint, and it may not work for you. I’m just the kind of person who needs to know what my boundaries are. Setting deadlines and respecting them makes me a more disciplined developer. More than that, it makes me work efficiently because I stop overthinking things when I know I have a fixed amount of time, and that leads to faster builds.

Conclusion

The best and worst thing about solo development is the “solo” part. There’s a lot of freedom in working alone, and that freedom can be inspiring. However, all that freedom can be intoxicating, and if left unchecked, it becomes a debilitating hindrance to productivity and progress. That’s a good reason why solo development isn’t for everyone. Some folks will respond a lot better to a team environment.

But if you are a solo developer, then I hope my personal experiences are helpful to you. I’ve had to look hard at myself in the mirror many days to come to realize that I am not a perfect developer who can build the “perfect” app alone. It takes planning, discipline, and humility to make anything, especially the right app that does exactly the right thing.

Ideas are cheap and easy, but stepping back from all that freedom and adding our own constraints, favoring progress over perfection, is the secret sauce that keeps us moving and spending our time on the essential things.

Friday, January 31, 2025

How To Design For High-Traffic Events And Prevent Your Website From Crashing

 Product drops and sales are a great way to increase revenue, but these events can result in traffic spikes that affect a site’s availability and performance. To prevent website crashes, you’ll have to make sure that the sites you design can handle large numbers of server requests at once. Let’s discuss how!

 

Product launches and sales typically attract large volumes of traffic. Too many concurrent server requests can lead to website crashes if you’re not equipped to deal with them. This can result in a loss of revenue and reputation damage.

The good news is that you can maximize availability and prevent website crashes by designing websites specifically for these events. For example, you can switch to a scalable cloud-based web host, or compress/optimize images to save bandwidth.

In this article, we’ll discuss six ways to design websites for high-traffic events like product drops and sales.

How To Design For High-Traffic Events

Let’s take a look at six ways to design websites for high-traffic events, without worrying about website crashes and other performance-related issues.

1. Compress And Optimize Images

One of the simplest ways to design a website that accommodates large volumes of traffic is to optimize and compress images. Typically, images have very large file sizes, which means they take longer for browsers to parse and display. Additionally, they can be a huge drain on bandwidth and lead to slow loading times.

You can free up space and reduce the load on your server by compressing and optimizing images. It’s a good idea to resize images to make them physically smaller. You can often do this using built-in apps on your operating system.

There are also online optimization tools available like Tinify, as well as advanced image editing software like Photoshop or GIMP:

GIMP

Image format is also a key consideration. Many designers rely on JPG and PNG, but modern image formats like WebP can reduce the weight of the image and provide a better user experience (UX).

You may even consider installing an image optimization plugin or an image CDN to compress and scale images automatically. Additionally, you can implement lazy loading, which prioritizes the loading of images above the fold and delays those that aren’t immediately visible.
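
If you roll lazy loading yourself rather than relying on a plugin, a minimal sketch might look like the following. It assumes images are marked up with a data-src attribute holding the real URL, which is just a convention used here, not a requirement of any particular tool:

```ts
// Lazy-load any image marked up as <img data-src="/products/shoe.webp" alt="Product photo">.
const lazyImages = document.querySelectorAll<HTMLImageElement>("img[data-src]");

const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;   // still off-screen, keep waiting
    const img = entry.target as HTMLImageElement;
    img.src = img.dataset.src ?? "";       // swap in the real image URL
    img.removeAttribute("data-src");
    obs.unobserve(img);                    // this image is done
  }
});

lazyImages.forEach((img) => observer.observe(img));
```

For many cases, the native loading="lazy" attribute on img elements achieves the same result without any script at all.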

2. Choose A Scalable Web Host

The most convenient way to design a high-traffic website without worrying about website crashes is to upgrade your web hosting solution.

Traditionally, when you sign up for a web hosting plan, you’re allocated a pre-defined number of resources. This can negatively impact your website performance, particularly if you use a shared hosting service.

Upgrading your web host ensures that you have adequate resources to serve visitors flocking to your site during high-traffic events. If you’re not prepared for this eventuality, your website may crash, or your host may automatically upgrade you to a higher-priced plan.

Therefore, the best solution is to switch to a scalable web host like Cloudways Autonomous:

Cloudways

This is a fully managed WordPress hosting service that automatically adjusts your web resources based on demand. This means that you’re able to handle sudden traffic surges without the hassle of resource monitoring and without compromising on speed.

With Cloudways Autonomous your website is hosted on multiple servers instead of just one. It uses Kubernetes with advanced load balancing to distribute traffic among these servers. Kubernetes is capable of spinning up additional pods (think of pods as servers) based on demand, so there’s no chance of overwhelming a single server with too many requests.

High-traffic events like sales can also make your site a prime target for hackers. This is because, in high-stress situations, many sites enter a state of greater vulnerability and instability. But with Cloudways Autonomous, you’ll benefit from DDoS mitigation and a web application firewall to improve website security.

3. Use A CDN

As you’d expect, large volumes of traffic can significantly impact the security and stability of your site’s network. This can result in website crashes unless you take the proper precautions when designing sites for these events.

A content delivery network (CDN) is an excellent solution to the problem. You’ll get access to a collection of strategically-located servers, scattered all over the world. This means that you can reduce latency and speed up your content delivery times, regardless of where your customers are based.

When a user makes a request for a website, they’ll receive content from a server that’s physically closest to their location. Plus, having extra servers to distribute traffic can prevent a single server from crashing under high-pressure conditions. Cloudflare is one of the most robust CDNs available, and luckily, you’ll get access to it when you use Cloudways Autonomous.

You can also find optimization plugins or caching solutions that give you access to a CDN. Some tools like Jetpack include a dedicated image CDN, which is built to accommodate and auto-optimize visual assets.

4. Leverage Caching

When a user requests a website, it can take a long time to load all the HTML, CSS, and JavaScript contained within it. Caching can help your website combat this issue.

A cache functions as a temporary storage location that keeps copies of your web pages on hand (once they’ve been requested). This means that every subsequent request will be served from the cache, enabling users to access content much faster.

The cache mainly deals with static content like HTML which is much quicker to parse compared to dynamic content like JavaScript. However, you can find caching technologies that accommodate both types of content.

There are different caching mechanisms to consider when designing for high-traffic events. For example, edge caching is generally used to cache static assets like images, videos, or web pages. Meanwhile, database caching enables you to optimize server requests.

If you’re expecting fewer simultaneous sessions (which isn’t likely in this scenario), server-side caching can be a good option. You could even implement browser caching, which affects static assets based on your HTTP headers.
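
To make the browser caching idea concrete, here’s a rough sketch of what those HTTP headers look like using Node’s built-in http module; the one-day max-age and the /static/ path are arbitrary values for illustration. On a WordPress site, these headers would normally be set by your host or a caching plugin rather than hand-written code:

```ts
import { createServer } from "node:http";

const ONE_DAY = 60 * 60 * 24; // seconds

const server = createServer((req, res) => {
  if (req.url?.startsWith("/static/")) {
    // Static assets (images, CSS, JS) can be cached by the browser and any CDN in front of it.
    res.setHeader("Cache-Control", `public, max-age=${ONE_DAY}`);
  } else {
    // Dynamic pages: make the browser revalidate so shoppers always see current stock and prices.
    res.setHeader("Cache-Control", "no-cache");
  }
  res.end("ok");
});

server.listen(3000);
```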

There are plenty of caching plugins available if you want to add this functionality to your site, but some web hosts provide built-in solutions. For example, Cloudways Autonomous uses Cloudflare’s edge cache and integrated object cache.

5. Stress Test Websites

One of the best ways to design websites while preparing for peak traffic is to carry out comprehensive stress tests.

This enables you to find out how your website performs in various conditions. For instance, you can simulate high-traffic events and discover the upper limits of your server’s capabilities. This helps you avoid resource drainage and prevent website crashes.

You might have experience with speed testing tools like Pingdom, which assess your website performance. But these tools don’t help you understand how performance may be impacted by high volumes of traffic.

Therefore, you’ll need to use a dedicated stress test tool like Loader.io:

Loader.io

This is completely free to use, but you’ll need to register for an account and verify your website domain. You can then download the provided verification file and upload it to your server via FTP.

After that, you’ll find three different tests to carry out. Once your test is complete, you can take a look at the average response time and maximum response time, and see how this is affected by a higher number of clients.

6. Refine The Backend

The final way to design websites for high-traffic events is to refine the WordPress back end.

The admin panel is where you install plugins, activate themes, and add content. The more of these features that you have on your site, the slower your pages will load.

Therefore, it’s a good idea to delete any old pages, posts, and images that are no longer needed. If you have access to your database, you can even go in and remove any archived materials.

On top of this, it’s best to remove plugins that aren’t essential for your website to function. Again, with database access, you can get in there and delete any tables that sometimes get left behind when you uninstall plugins via the WordPress dashboard.

When it comes to themes, you’ll want to opt for a simple layout with a minimalist design. Themes that come with lots of built-in widgets or rely on third-party plugins will likely add bloat to your loading times. Essentially, the lighter your back end, the quicker it will load.

Conclusion

Product drops and sales are a great way to increase revenue, but these events can result in traffic spikes that affect a site’s availability and performance. To prevent website crashes, you’ll have to make sure that the sites you design can handle large numbers of server requests at once.

The easiest way to support fluctuating traffic volumes is to upgrade to a scalable web hosting service like Cloudways Autonomous. This way, you can adjust your server resources automatically, based on demand. Plus, you’ll get access to a CDN, caching, and an SSL certificate. Get started today!

Thursday, January 30, 2025

Svelte 5 And The Future Of Frameworks: A Chat With Rich Harris

 After months of anticipation, debate, and even a bit of apprehension, Svelte 5 arrived earlier this year. Frederick O’Brien caught up with its creator, Rich Harris, to talk about the path that brought him and his team here and what lies ahead.

Svelte occupies a curious space within the web development world. It’s been around in one form or another for eight years now, and despite being used by the likes of Apple, Spotify, IKEA, and the New York Times, it still feels like something of an upstart, maybe even a black sheep. As creator Rich Harris recently put it,

“If React is Taylor Swift, we’re more of a Phoebe Bridgers. She’s critically acclaimed, and you’ve heard of her, but you probably can’t name that many of her songs.”

— Rich Harris

This may be why the release of Svelte 5 in October this year felt like such a big deal. It tries to square the circle of convention and innovation. Can it remain one of the best-loved frameworks on the web while shaking off suspicions that it can’t quite rub shoulders with React, Vue, and others when it comes to scalability? Whisper it, but they might just have pulled it off. The post-launch reaction has been largely glowing, with weekly npm downloads doubling compared to six months ago.

Still, I’m not in the predictions game. The coming months and years will be the ultimate measure of Svelte 5. And why speculate on the most pressing questions when I can just ask Rich Harris myself? He kindly took some time to chat with me about Svelte and the future of web development.

Not Magic, But Magical

Svelte 5 is a ground-up rewrite. I don’t want to get into the weeds here — key changes are covered nicely in the migration guide — but suffice it to say the big one where day-to-day users are concerned is runes. The at-times magic-feeling $ has given way to the more explicit $state, $derived, and $effect.
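
For a taste of the difference, here’s a minimal sketch of runes in a shared .svelte.ts module; the counter itself is an invented example rather than anything from the Svelte docs:

```ts
// counter.svelte.ts: reactive state shared via Svelte 5 runes
export function createCounter() {
  let count = $state(0);                 // reactive state, declared explicitly
  const doubled = $derived(count * 2);   // recomputed whenever `count` changes

  return {
    get count() { return count; },
    get doubled() { return doubled; },
    increment() { count += 1; },
  };
}
```

Inside components, $effect takes over the side-effect duties that reactive statements used to handle.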

A lot of the talk around Svelte 5 included the sentiment that it marks the ‘maturation’ of the framework. To Harris and the Svelte team, it feels like a culmination, combining lessons learned with aspirations to form something fresh yet familiar.

“This does sort of feel like a new chapter. I’m trying to build something that you don’t feel like you need to get a degree in it before you can be productive in it. And that seems to have been carried through with Svelte 5.”

— Rich Harris

Although raw usage numbers aren’t everything, seeing the uptick in installations has been a welcome signal for Harris and the Svelte team.

“For us, success is definitely not based around adoption, though seeing the number go up and to the right gives us reassurance that we’re doing the right thing and we’re on the right track. Even if it’s not the goal, it is a useful indication. But success is really people building their apps with this framework and building higher quality, more resilient, more accessible apps.”

— Rich Harris

The tenets of a Svelte philosophy outlined by Harris earlier this year reinforce the point:

  1. The web matters.
  2. Optimise for vibes.
  3. Don’t optimise for adoption.
  4. HTML, The Mother Language.
  5. Embrace progress.
  6. Numbers lie.
  7. Magical, not magic.
  8. Dream big.
  9. No one cares.
  10. Design by consensus.

Click the link above to hear these expounded upon, but you get the crux. Svelte is very much a qualitative project. Although Svelte performs well in a fair few performance metrics itself, Harris has long been a critic of metrics like Lighthouse being treated as ends in themselves. Fastest doesn’t necessarily mean best. At the end of the day, we are all in the business of making quality websites.

Rich Harris – North Star, JSNation US 2024

Frameworks are a means to that end, and Harris sees plenty of work to be done there.

Software Is Broken

Every milestone is a cause for celebration. It’s also a natural pause in which to ask, “Now what?” For the Svelte team, the sights seem firmly set on shoring up the quality of the web.

“A conclusion that we reached over the course of a recent discussion is that most software in the world is kind of terrible. Things are not good. Half the stuff on my phone just doesn’t work. It fails at basic tasks. And the same is true for a lot of websites. The number of times I’ve had to open DevTools to remove the disabled attribute from a button so that I can submit a form, or been unclear on whether a payment went through or not.”

— Rich Harris

This certainly meshes with my experience and, doubtless, countless others. Between enshittification, manipulative algorithms, and the seemingly endless influx of AI-generated slop, it’s hard to shake the feeling that the web is becoming increasingly decadent and depraved.

“So many pieces of software that we use are just terrible. They’re just bad software. And it’s not because software engineers are idiots. Our main priority as toolmakers should be to enable people to build software that isn’t broken. As a baseline, people should be able to build software that works.”

— Rich Harris

This sense of responsibility for the creation and maintenance of good software speaks to the Svelte team’s holistic outlook and also looks to influence priorities going forward.

Brave New World

Part of Svelte 5 feels like a new chapter in the sense of fresh foundations. Anyone who’s worked in software development or web design will tell you how much of a headache ground-up rewrites are. Rebuilding the foundations is something to celebrate when you pull it off, but it also begs the question: What are the foundations for?

Harris has his eyes on the wider ecosystem around frameworks.

“I don’t think there’s a lot more to do to solve the problem of taking some changing application state and turning it into DOM, but I think there’s a huge amount to be done around the ancillary problems. How do we load the data that we put in those components? Where does that data live? How do we deploy our applications?”

— Rich Harris

In the short to medium term, this will likely translate into some love for SvelteKit, the web application framework built around Svelte. The framework might start having opinions about authentication and databases, an official component library perhaps, and dev tools in the spirit of the Astro dev toolbar. And all these could be precursors to even bigger explorations.

“I want there to be a Rails or a Laravel for JavaScript. In fact, I want there to be multiple such things. And I think that at least part of Svelte’s long-term goal is to be part of that. There are too many things that you need to learn in order to build a full stack application today using JavaScript.”

— Rich Harris

Onward

Although Svelte has been ticking along happily for years, the release of version 5 has felt like a new lease of life for the ecosystem around it. Every day brings new and exciting projects to the front page of the /r/sveltejs subreddit, while this year’s Advent of Svelte has kept up a sense of momentum following the stable release.

Quite a few Svelte-based projects have caught my eye in recent weeks.

Despite the turbulence and inescapable sense of existential dread surrounding much tech, this feels like an exciting time for web development. The conditions are ripe for lovely new things to emerge.

And as for Svelte 5 itself, what does Rich Harris say to those who might be on the fence?

“I would say you have nothing to lose but an afternoon if you try it. We have a tutorial that will take you from knowing nothing about Svelte or even existing frameworks. You can go from that to being able to build applications using Svelte in three or four hours. If you just want to learn Svelte basics, then that’s an hour. Try it.”

— Rich Harris

What Does AI Really Mean?

 We, as human beings, don’t worry too much about making sure the connections land at the right point. Our brain just works that way, declaratively. However, for building AI, we need to be more explicit. Let’s dive in!

In 2024, Artificial Intelligence (AI) hit the limelight with major advancements. The problem with a term reaching common knowledge and so much public attention so quickly is that it becomes ambiguous. While we all have an approximation of what it means to “use AI” in something, it’s not widely understood what infrastructure having AI in your project, product, or feature entails.

So, let’s break down the concepts that make AI tick. How is data stored and correlated, and how are the relationships built in order for an algorithm to learn how to interpret that data? As with most data-oriented architectures, it all starts with a database.

Data As Coordinates

Creating intelligence, whether artificial or natural, works in a very similar way. We store chunks of information, and we then connect them. Multiple visualization tools and metaphors show this as a 3-dimensional space with dots connected by lines on a graph. Those connections and their intersections are what make up intelligence. For example, we put together “chocolate is sweet and nice” and “drinking hot milk makes you warm”, and we make “hot chocolate”.

We, as human beings, don’t worry too much about making sure the connections land at the right point. Our brain just works that way, declaratively. However, for building AI, we need to be more explicit. So think of it as a map. In order for a plane to leave country A and arrive at country B, it requires a precise system: we have coordinates, we have two axes on our maps, and they can be represented as a vector: [28.3772, 81.5707].

For our intelligence, we need a more complex system; two dimensions will not suffice; we need thousands. That’s what vector databases are for. Our intelligence can now correlate terms based on the distance and/or angle between them, create cross-references, and establish patterns in which every term occurs.

A vector database is a specialized database that stores and manages data as high-dimensional vectors. It enables efficient similarity searches and semantic matching.
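
In code, an entry in such a database is usually little more than an ID, the original content, and its vector. The shape below is a generic sketch, not the schema of any particular vector database:

```ts
// A generic vector-database record: the text we stored plus its coordinates.
interface VectorRecord {
  id: string;
  text: string;          // the original content
  embedding: number[];   // thousands of dimensions instead of a map's two
}

const record: VectorRecord = {
  id: "doc-1",
  text: "chocolate is sweet and nice",
  embedding: [0.12, -0.08, 0.33 /* ...hundreds more dimensions... */],
};
```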

Querying Per Approximation

As stated in the last section, matching the search terms (your prompt) to the data is the exercise of semantic matching (establishing the pattern in which keywords in your prompt are used within the model’s own data) and similarity search, the distance (angular or linear) between each entry. That’s a roughly accurate representation. What a similarity search actually does is treat each vector (thousands of coordinates long) as a point in this weird multi-dimensional space. Finally, to establish similarity between these points, the distance and/or angle between them is measured.

This is one of the reasons why AI isn’t deterministic (we aren’t either): for the same prompt, the search may produce different outputs based on how the scores are defined at that moment. If you’re building an AI system, there are algorithms you can use to establish how your data will be evaluated.

This can produce more precise and accurate results depending on the type of data. There are three main algorithms, and each performs better for a certain kind of data, so understanding the shape of the data and how each of these concepts correlates with it is important for choosing the correct one. In a very hand-wavy way, here’s a rule of thumb for each (a small code sketch of all three follows below):

  • Cosine Similarity
    Measures the angle between vectors, so the magnitude (the actual numbers) matters less. It’s great for text/semantic similarity.
  • Dot Product
    Captures linear correlation and alignment. It’s great for establishing relationships between multiple points/features.
  • Euclidean Distance
    Calculates the straight-line distance. It’s good for dense numerical spaces since it highlights spatial distance.
Note: When working with unstructured data (text entries like your tweets, a book, multiple recipes, or your product’s documentation), cosine similarity is the way to go.
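
As promised, here’s that small sketch of all three measures, implemented over plain arrays of numbers:

```ts
// The three similarity measures over plain number arrays.
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

function magnitude(a: number[]): number {
  return Math.sqrt(dotProduct(a, a));
}

// Angle-based: ignores magnitude, good for text/semantic similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  return dotProduct(a, b) / (magnitude(a) * magnitude(b));
}

// Straight-line distance: good for dense numerical spaces.
function euclideanDistance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

console.log(cosineSimilarity([1, 2, 3], [2, 4, 6]));  // 1: same direction
console.log(euclideanDistance([1, 2, 3], [2, 4, 6])); // ~3.74: but far apart in space
```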

Now that we understand how the bulk of the data is stored and the relationships are built, we can start talking about how the intelligence works — let the training begin!

Language Models

A language model is a system trained to understand, predict, and finally generate human-like text by learning statistical patterns and relationships between words and phrases in large text datasets. For such a system, language is represented as probabilistic sequences.

In that way, a language model is immediately capable of efficient completion (hence the quote stating that 90% of the code at Google is written by AI — auto-completion), translation, and conversation. Those tasks are the low-hanging fruit of AI because they depend on estimating the likelihood of word combinations and improve by reaffirming and adjusting the patterns based on usage feedback (rebalancing the similarity scores).
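
To make “probabilistic sequences” a little more concrete, here is a deliberately tiny sketch of the idea: a toy bigram model that counts which word tends to follow which and predicts the most likely continuation. Real language models learn far richer patterns over tokens, but the principle of “estimate the likely next piece” is the same:

```ts
// Toy bigram model: count which word follows which, then predict the most likely next word.
function train(corpus: string[]): Map<string, Map<string, number>> {
  const counts = new Map<string, Map<string, number>>();
  for (const sentence of corpus) {
    const words = sentence.toLowerCase().split(/\s+/);
    for (let i = 0; i < words.length - 1; i++) {
      const next = counts.get(words[i]) ?? new Map<string, number>();
      next.set(words[i + 1], (next.get(words[i + 1]) ?? 0) + 1);
      counts.set(words[i], next);
    }
  }
  return counts;
}

function predictNext(counts: Map<string, Map<string, number>>, word: string): string | undefined {
  const next = counts.get(word.toLowerCase());
  if (!next) return undefined;
  // Pick the most frequent follower, i.e. the highest-probability continuation.
  return [...next.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

const model = train(["drinking hot milk", "hot chocolate is sweet", "hot chocolate is nice"]);
console.log(predictNext(model, "hot")); // "chocolate": it follows "hot" twice, "milk" only once
```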

Now that we understand what a language model is, we can start classifying language models as large and small.

Large Language Models (LLMs)

As the name says, LLMs use large-scale datasets — with billions of parameters, up to 70 billion. This allows them to be diverse and capable of creating human-like text across different knowledge domains. Think of them as big generalists. This makes them not only versatile but extremely powerful. And as a consequence, training them demands a lot of computational work.

Small Language Models (SLMs)

SLMs work with smaller datasets, with parameter counts ranging from 100 million to 3 billion. They take significantly less computational effort, which makes them less versatile but better suited for specific tasks with more defined constraints. SLMs can also be deployed more efficiently and have faster inference when processing user input.

Fine-Tuning

Fine-tuning an LLM consists of adjusting the model’s weights through additional specialized training on a specific (high-quality) dataset. Basically, adapting a pre-trained model to perform better in a particular domain or task.

As training iterates through the heuristics within the model, it enables a more nuanced understanding. This leads to more accurate and context-specific outputs without creating a custom language model for each task. On each training iteration, developers tune the learning rate, weights, and batch size while providing a dataset tailored to that particular knowledge area. Of course, each iteration also depends on appropriately benchmarking the output performance of the model.

As mentioned above, fine-tuning is particularly useful for a well-defined task in a niche knowledge area, for example, creating summaries of nutritional scientific articles, correlating symptoms with a subset of possible conditions, and so on.

Fine-tuning is not something that can be done frequently or quickly, since it requires numerous iterations, and it isn’t intended for factual information, especially information that depends on current events or streamed data.
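
In practice, each fine-tuning run comes down to a handful of knobs plus a curated dataset. The shape below is purely illustrative; the field names are my own and don’t come from any specific fine-tuning API:

```ts
// Purely illustrative: the knobs a fine-tuning iteration typically exposes.
interface FineTuneConfig {
  baseModel: string;       // the pre-trained model being adapted
  trainingFile: string;    // high-quality, domain-specific examples
  learningRate: number;    // how aggressively weights are adjusted per step
  batchSize: number;       // examples processed per weight update
  epochs: number;          // passes over the dataset
}

const nutritionSummaries: FineTuneConfig = {
  baseModel: "some-base-llm",
  trainingFile: "nutrition-article-summaries.jsonl",
  learningRate: 2e-5,
  batchSize: 16,
  epochs: 3,
};
```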

Enhancing Context With Information

Most conversations we have are directly dependent on context; with AI, it isn’t so different. While there are definitely use cases that don’t entirely depend on current events (translations, summarization, data analysis, etc.), many others do. However, it isn’t quite feasible yet to have LLMs (or even SLMs) trained on a daily basis.

For this, a new technique can help: Retrieval-Augmented Generation (RAG). It consists of injecting a smaller dataset into the LLM in order to provide it with more specific (and/or current) information. With RAG, the LLM isn’t better trained; it still has all the generalist training it had before — but now, before it generates the output, it receives an ingest of new information to be used.

Note: RAG enhances the LLM’s context, providing it with a more comprehensive understanding of the topic.

For RAG to work well, the data must be prepared and formatted in a way that the LLM can properly digest. Setting it up is a multi-step process:

  1. Retrieval
    Query external data (such as web pages, knowledge bases, and databases).
  2. Pre-Processing
    Information undergoes pre-processing, including tokenization, stemming, and removal of stop words.
  3. Grounded Generation
    The pre-processed retrieved information is then seamlessly incorporated into the pre-trained LLM.

RAG first retrieves relevant information from a database using a query generated by the LLM. Integrating RAG with an LLM enhances its context, providing it with a more comprehensive understanding of the topic. This augmented context enables the LLM to generate more precise, informative, and engaging responses.

Since it provides access to fresh information via easy-to-update database records, this approach is mostly for data-driven responses. Because this data is context-focused, it also improves factual accuracy. Think of RAG as a tool to turn your LLM from a generalist into a specialist.

Enhancing an LLM’s context through RAG is particularly useful for chatbots, assistants, agents, or other usages where the output quality is directly connected to domain knowledge. But while RAG is the strategy for collecting and “injecting” data into the language model’s context, that data has to be supplied in a form the model can relate to its own patterns, which is why it also needs to have its meaning embedded.
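
Put together, a RAG request is essentially “embed the question, fetch the closest records, and prepend them to the prompt”. Here is a sketch with hypothetical embed, vectorDb, and llm helpers standing in for whichever embedding model, vector database, and language model you actually use:

```ts
// Hypothetical helpers: swap in your real embedding model, vector DB client, and LLM client.
declare function embed(text: string): Promise<number[]>;
declare const vectorDb: { search(vector: number[], topK: number): Promise<{ text: string }[]> };
declare const llm: { generate(prompt: string): Promise<string> };

async function answerWithRag(question: string): Promise<string> {
  // 1. Retrieval: turn the question into a vector and find the closest stored entries.
  const queryVector = await embed(question);
  const matches = await vectorDb.search(queryVector, 3);

  // 2. Grounded generation: hand the retrieved context to the general-purpose LLM.
  const context = matches.map((m) => m.text).join("\n");
  const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
  return llm.generate(prompt);
}
```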

Embedding

To make data digestible by the LLM, we need to capture each entry’s semantic meaning so the language model can form the patterns and establish the relationships. This process is called embedding, and it works by creating a static vector representation of the data. Different language models support different embedding sizes; for example, you can have embeddings ranging from 384 dimensions all the way to 3,072.

In other words, in comparison to our Cartesian coordinates on a map (e.g., [28.3772, 81.5707]) with only two dimensions, an embedded entry for an LLM has anywhere from 384 to 3,072 dimensions.
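
In code terms, the jump in dimensionality looks something like this (embedText is a hypothetical stand-in for whichever embedding model you call):

```ts
// Two dimensions place a point on a map...
const mapCoordinates: number[] = [28.3772, 81.5707];

// ...while an embedding places a piece of text in a space with hundreds or thousands of dimensions.
declare function embedText(text: string): Promise<number[]>;

async function inspectEmbedding() {
  const vector = await embedText("chocolate is sweet and nice");
  console.log(mapCoordinates.length); // 2
  console.log(vector.length);         // e.g. 384, 768, or 3072, depending on the model
}
```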

Let’s Build

I hope this helped you better understand what those terms mean and the processes that the term “AI” encompasses. This merely scratches the surface of the complexity, though. We still need to talk about AI agents and how all these approaches intertwine to create richer experiences. Perhaps we can do that in a later article — let me know in the comments if you’d like that!

Meanwhile, let me know your thoughts and what you build with this!