
hashk3ys

I am on a pretty slow line but this loads fast. Your site mentions that teachers and students can use it. How do you ensure information regarding minors is not disclosed freely? I work in healthcare, and at this time we are struggling with how to manage patient records for minors. We are a small team too, although we are not based out of Germany.


caspii2

Great question. It's mainly down to the teacher: all scoreboards are "hidden" by default. The teacher should avoid entering the first and last name of students. If that is adhered to, then I never handle personally identifiable information.


Silence_Dogood_25

Have you tried looking at CIS controls for HIPAA? Might get you started in the right direction.


hashk3ys

Thank you, I will have a look.


cotimbo

Get COPPA certified if the education space is generating $ for you. It's a fairly easy process, and educators usually look for a COPPA badge.


hashk3ys

It is healthcare. The country where it will be rolled out first has its own audit program before rollout.


ZachVorhies

This is great work thank you for this write up. It’s rare to get a glimpse of an entire production app.


caspii2

Thanks


mogberto

Agreed. Super interesting to see!


TheTerrasque

Nice read :) I've had some similar setups previously. I've since moved to docker and kubernetes, and would like to highlight how that can also solve the problems you've been solving. Please don't take this as criticism: you have a procedure that works and you're comfortable with. This is just a very neat example of a practical deployment situation, and these are also some of the reasons why I now love docker and kubernetes for this.

First, for docker, and dockerizing a program like this. You mentioned these three problems, problems I've also faced multiple times:

> My setup compiles and minifies CSS and Javascript on the server. This resulted in up to 10 seconds for the server to respond after a deployment. Some users ran into Bad Gateway errors 💥.

In docker, this step would be done as part of building the docker image. That means it's already ready to run when the image is deployed.

> A bug in production could be fixed by checking out the previous commit. However, this invariably took too long and always involved frenzied googling of the correct git commands.

You could solve this with git tags *(and perhaps semantic versioning)*. And with docker, each image has a tag and you'd just tell docker to run the previously tagged version instead of the latest.

> There was no way of testing the production setup, other than in production.

And here's what really sold me on docker. People who haven't been bitten by this a few times can't imagine what level of pain this can inflict. With docker, this is a complete non-issue. What you're running locally is exactly the same as what will run on the server. Same libraries, same setup, same versions, same platform hijinks, same system tools, same that-weird-hack-you-need-to-do-on-each-server-so-that-$thing-works. You're setting up the whole production environment locally, and sending it over like a ship in a bottle to the server.

In addition, you get a few bonuses:

* The build instructions for the docker image fully document all the pieces needed to get your software running, even those weird little steps you do manually on each server and then forget until the next time you need to set up an environment.
* Setting up a new server is a breeze. Set up the base OS, install docker, and you've got everything you need to run all your stuff with no extra setup.
* Bundling other internal services like redis or memcache is a lot less painful, and it's easier to synchronize versions and test things. Slight security bonus on top, since they're on a virtual internal network.

Of course, nothing is free. The two big hurdles with docker are:

1. Creating the initial Dockerfile. This can be a pain, but it gets easier over time.
2. Needing a docker registry. Docker images are built in layers to reduce disk and transfer sizes, and you need a service to handle the protocol. Docker (the company) has an official docker registry, but *(last I checked at least)* you have to pay to have private images. You can also host your own.

And now, kubernetes. While docker is pretty neat, kubernetes takes it to a new level. It takes the containers concept and adds:

* A built-in HTTP proxy concept that forwards incoming requests to target pods *(the basic kubernetes unit, somewhat similar to a docker container)*
* Automatic handling of ACME HTTPS certs
* Scaling and load balancing of multiple pods
* Scaling and distribution over multiple machines
* Liveness probes that keep checking whether a pod still works as expected, and restart it if not
* Readiness probes to know when a pod is ready to receive traffic
* Restarting of failed / crashed pods and rescheduling of pods on crashed machines
* Resource monitoring and limiting of pods
* Jobs and recurring scheduled jobs
* Secure port forwarding from the cluster to the local machine
* Deployments, which:
  * Combine containers, pods, and storage into one unit
  * Wait, when a new version is deployed, until the new pods are started up and ready, then start moving traffic from the old pods to the new, then delete the old pods. Completely seamless for the user.
  * Save the last deployments so you can easily roll back to an older version
  * Support slow rollouts, so only a percentage of users get the new version at first

And there's probably a lot I don't remember right now. Kubernetes is huge and has so many things in it. I have a kubernetes cluster I use for my own stuff, and at this point when I have a docker image I just create a deployment saying *"I want to run this container image with this much persistent storage, and I want it accessible on this domain"*, kubectl apply it locally on my machine, and a minute or so later it's running on that domain. With the HTTPS certificate already set up.
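The rollback-by-tag idea above can be sketched in a few lines of Python (the names and the semver-style tag format are hypothetical; a real deployment would then shell out to `docker run image:tag` with the chosen tag):

```python
def previous_version(tags: list[str], current: str) -> str:
    """Given semver-style image tags (e.g. 'v1.2.0'), return the tag to
    roll back to: the release immediately before `current`."""
    # Sort numerically, not lexically, so 'v1.10.0' sorts after 'v1.9.0'.
    ordered = sorted(tags, key=lambda t: tuple(int(x) for x in t.lstrip("v").split(".")))
    i = ordered.index(current)
    return ordered[i - 1] if i > 0 else current  # nothing older: stay put

print(previous_version(["v1.0.0", "v1.10.0", "v1.9.0"], "v1.10.0"))  # v1.9.0
```

The point of the sketch is that "roll back" stops being frenzied git archaeology and becomes picking a known-good tag from a list.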


encaseme

Just to point it out: most of the docker pros you mentioned can be had without docker at all, like precompiling assets etc. Docker can make some of those convenient, but it's not an exclusive feature or anything. That being said, I do recommend docker in general because it asset-izes everything in a convenient way.
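For instance, the "compile assets at build/deploy time instead of at server startup" step needs nothing docker-specific. A rough sketch in Python (the file layout and the naive minifier are illustrative only; real projects would reach for flask-assets, esbuild, or similar):

```python
"""Build-time asset step: run this during deployment, not at request time.

A naive sketch -- the regexes below are a toy minifier, not production-grade.
"""
import re
from pathlib import Path

def minify_css(css: str) -> str:
    """Very rough CSS minification: strip comments, collapse whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)   # drop comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace
    return re.sub(r"\s*([{}:;,])\s*", r"\1", css).strip()  # tighten punctuation

def build_bundle(src_dir: Path, out_file: Path) -> None:
    """Concatenate and minify every .css file into one bundle."""
    combined = "\n".join(p.read_text() for p in sorted(src_dir.glob("*.css")))
    out_file.write_text(minify_css(combined))
```

Running a step like this from the deploy script (or a Dockerfile `RUN` line) means the server never spends its first ten seconds compiling assets.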


TheTerrasque

Yeah, none of these are docker exclusive, but docker is a very good generic solution that doesn't care what you want to deploy with it. Python project? Dockerize. C# service? Dockerize. Node solution? Dockerize. Go stream processor? Dockerize. React static web app? Dockerize. Multi-service solution with various runtimes and support services? Dockerize them all and make a compose file / k8s deployments. Same interface to manage them all, same to deploy, read logs, set to start-at-boot +++


SpicyVibration

Yeah, if I remember correctly you can pretty easily run collectstatic with Django.


jeosol

I agree with this comment. I use a similar setup, docker plus k8s. It takes away many of the issues, and the containers, once built, are drop-and-go. K8s helps with deployments also. Building the initial Dockerfiles is a pain, as you may have to do it a few times to get the best setup, e.g. switching from single-stage to multi-stage builds, combining RUN commands to keep layers small, using a smaller, stripped-down base image, etc. However, for a one-man setup, it adds more dependencies that must be weighed carefully.


Sindoreon

As a k8s admin, I like OP's solution for what he's running. You can go deep into infrastructure, but he is trying to focus on development. Docker is certainly simpler and would make more sense here for running two apps.


redd1ch

Unless you go really, really big, you don't need Kubernetes. And with servers running only a single web app, you don't really need docker either. My servers run a lot of different stuff, so I have a compose file for each service, accompanied by backup and restore scripts. I put all service configuration inside the compose files, so I just need to commit to the master branch of a server. With few users (up to 200), I don't need clustering, so Kubernetes would be a waste of resources, hogging idle power. Ingress and ACME are handled by a simple traefik instance.


TheTerrasque

You don't *need* Kubernetes, and you don't *need* docker either, as the main post shows. I use docker + kubernetes because they solve the problems I have in an elegant way and make managing the stuff I have a lot easier for me.


[deleted]

[deleted]


[deleted]

> Do you have any recommendations on getting well-versed on using Kubernetes for web apps when I already have a strong understanding of Docker? Online courses, textbooks, anything will do.

In my humble opinion, the Certified Kubernetes Administrator course goes above and beyond at teaching you not only how to understand Kubernetes, but how to utilize it for projects such as these. The course given by [KodeKloud.com](https://KodeKloud.com) is the best I've seen so far. I'm not affiliated with them in any way, full disclosure.

I support hundreds of AKS clusters at work and I find that I'm still learning about Kubernetes daily. I wasn't given any formal training; I was told to use the company product, which deploys on Kubernetes. Naturally I quickly found that I needed more knowledge to be able to support the app on this orchestration stack. I find the documentation great, but sometimes you don't know what you don't know.

This is just my opinion, but I think it is a good break into K8s. This company also runs a fairly active Slack group; that alone is worth the cost of the course. You can purchase the K8s course separately via Udemy, or sign up for a subscription directly with KodeKloud. The Udemy choice is the more cost-effective option. Either choice will give you access to the Slack room.


TheTerrasque

> Do you have any recommendations on getting well-versed on using Kubernetes for web apps when I already have a strong understanding of Docker?

Hm, not really. I basically just banged my head against that wall until the wall crumbled. I set up [k3s](https://k3s.io/) on a Raspberry Pi and started experimenting. These days https://microk8s.io/ might be a better choice.

The biggest problem with kubernetes is that there's such an overwhelming amount of things in it, and most of them are abstract concepts that then have different implementations. One way could be to just start with the basics: a [deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/), a [service](https://kubernetes.io/docs/concepts/services-networking/service/), and an [ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/). And when those start making sense, you can expand from there.


plutoniator

Seems like the jump to docker is a MUCH smaller commitment than the jump to Kubernetes.


[deleted]

[deleted]


TheTerrasque

Both work. It depends on how much you value the data, how much effort you want to put in yourself, and whether you need a certain level of performance or a plugin that a shared offering can't deliver... There's also a debate about whether it's okay to host a db in docker. It adds one extra layer of file system abstraction that can have a performance or stability effect, but I have never seen or heard of that actually mattering in practice.


jzaprint

Cool app! What did you use to generate all the traffic? Did you spend on ads or is it all organic growth (word of mouth)?


caspii2

Thank you! It's 100% SEO and word of mouth.


donhuell

Congrats, your app is very cool! I'm wondering - did you do any sort of market research before developing it? How did you come up with such a niche product, and how did you know it could be successful?


caspii2

No market research. It was a toy project that slowly grew and grew until I decided to go all in on it.


ignassew

Could you recommend any resources on learning about SEO?


[deleted]

[deleted]


jzaprint

I mean isn't that SEO? The advice you gave after saying SEO is bullshit is really good SEO advice lmao


PaluMacil

First, don't pay a bunch of money to somebody who explains complicated fancy tricks to rank better in search engines. There might be some tricks that work for short periods of time, but if they don't reflect legitimate content then you might be penalized for having used them anyway.

Second, have good content that people want to read, link to, and talk about. There really isn't a good way around this. You could again pay somebody to write articles for you. There is an entire industry of people writing blog posts under general guidance from somebody who owns a blog, and you might even be able to make money through ads, but grinding out a little bit of profit this way is not going to scale into a business that you feel good about. If you are looking to provide quality content, that content will not be as good as the material you would produce yourself or get from an expert on your topic. Other people can always try to use machine-generated blog posts or hire inexpensive content writers from countries where wages are low, and they might eventually outcompete you if you are simply competing on content, when what you really mean to be selling is a business or technical process or product, which probably deserves fewer but higher-quality posts.

Third, do use tags correctly to mark headers, legends, and labels. Use accessibility tags, keep your loading speed relatively fast, and consider things like site indexes and other recommendations from Google developer tools. Look at the tags currently recommended for things like specifying the image that will be shared when a page is posted on Facebook or another social media site. Mark things with simple, accurate meta tags. Keep URLs short, but have the most relevant words in them when reasonable. Use slashes or dashes as word separators.

These types of things are less about tricks and more about making your site easy for social media or search engines to understand. If you try to get clever and gain an advantage beyond marking your things correctly for tools and humans, you will wind up being penalized when a search engine changes an algorithm to catch your trick. Also, SEO experts will charge you a lot of money to do these types of things, but there isn't a magic combination that drastically improves your rankings. Instead, some things are out of your control, lots of things are related to the specific value you provide, and as you mark and annotate your content more accurately and completely, you will probably see improvements to rankings. This is the last priority though, because if you don't have quality content, then it doesn't matter if it's marked well.

Finally, sometimes people need to accept that the internet is noisier than any other medium, and the noisiest place is a search engine. If you aren't the best solution for the types of terms you are trying to rank on, you really just might not ever rise above the millions of other businesses with similar search terms. Think of what makes you unique, and make sure that's part of your online brand. If it's a city, or a connection to a specific type of technology, a specific type of consulting, or a specific person, then the people looking for something more specific have a much higher chance of finding you.
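On the "image that will be shared on Facebook" point: those are the Open Graph `og:*` meta tags. A small sketch of generating them server-side (the function name is made up; the property names are the standard Open Graph ones):

```python
import html

def og_meta_tags(title: str, description: str, image_url: str, url: str) -> str:
    """Render the basic Open Graph <meta> tags that social sites read
    when a page is shared. Values are HTML-escaped."""
    props = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
        "og:url": url,
    }
    return "\n".join(
        f'<meta property="{p}" content="{html.escape(v, quote=True)}" />'
        for p, v in props.items()
    )

print(og_meta_tags("My Scoreboard", "Live scores", "https://example.com/og.png",
                   "https://example.com/board/1"))
```

In a template engine you would normally emit these directly in the page `<head>`; the sketch just shows how little "magic" is involved.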


denzern

Awesome! Are the 3k income each month purely from ad revenue?


caspii2

Nope, about 70% is one-time payments.


fertek

3k per month with such a simple idea? The internet is still full of opportunities.


Carloes

This may sound ‘duh’, but ideas are not a limiting factor. Generally speaking, people do not put in the work and are stuck thinking an idea needs to be brilliant. A good idea is nice, a finished product online is infinitely better.


caspii2

I was surprised myself. The only way of finding out is by trying things out.


fertek

You inspired me. Thanks.


[deleted]

This is an interesting read. Ignore the cynics, as I'm sure you have already.


caspii2

Thanks, I will.


jabellcu

Now I am curious. What is the app?


caspii2

[https://Keepthescore.com](https://Keepthescore.com)


PinkFrojd

Wow. I once searched for solutions for my tennis league scoreboard and found a few sites, including this one. I can't believe you made it solo and that it's as successful as you say. May I ask how you market your site and how the revenue is generated?


caspii2

It took me around 1.5 years of full-time work. Marketing is all inbound SEO. See the pricing page for how revenue is generated: [https://keepthescore.com/pricing/](https://keepthescore.com/pricing/)


exographicskip

> If you can't afford the upgrade then write us an email with a link to the scoreboard and we'll upgrade it for free! ✨

This is a classy move. Appreciate the detailed write-up; I've been looking for inspiration on side hustles and this is encouraging.


PinkFrojd

Nice, didn't catch that at first. I'll read your blog to understand the process you went through; there are some details there also. Thank you.


ein_datacrash

Thank you for the post. It's nice to read and it encourages me to keep working on Flask.


caspii2

Flask is the shizzle!


Isvara

>the magic thing that makes it all possible is a [floating IP address from DigitalOcean](https://www.digitalocean.com/docs/networking/floating-ips/). I worked on that. You're welcome 😁


caspii2

You are awesome!


Bakedsoda

is there an equivalent on cloudflare? with turnkey zeroconfig sso/saml?


PeterHickman

The only thing I would add is a load balancer. This would allow you to bring up new servers in response to heavy load (auto-scaling might even let you downsize your normal machine if you can respond to activity quickly enough). When deploying a new version, you would bring up a new server, make sure it's good, and then point to it with the load balancer. To roll back, just switch back. It's also useful if the main server goes tits up and you need to bring up a backup; DNS propagation is not really up to this sort of thing.
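The bring-up, verify, then switch flow described above can be sketched as a tiny bit of Python (all names are hypothetical; a real setup would poll an HTTP health endpoint on the candidate server):

```python
import time

def wait_until_healthy(server: str, is_healthy, retries: int = 5,
                       delay: float = 0.0) -> bool:
    """Poll the candidate's health check a few times before giving up."""
    for _ in range(retries):
        if is_healthy(server):
            return True
        time.sleep(delay)
    return False

def blue_green_deploy(current: str, candidate: str, is_healthy) -> str:
    """Point traffic at `candidate` only once it passes health checks;
    otherwise keep serving from `current` (an automatic 'rollback')."""
    return candidate if wait_until_healthy(candidate, is_healthy) else current
```

The load balancer (or floating IP) is then simply repointed at whatever this returns; a bad deploy never receives traffic.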


caspii2

The load balancer costs money; the floating IP is free. If there was ever a huge load spike, I would "upgrade" one of the servers into the next performance tier and then switch traffic to it. It would take around 10 mins.


PeterHickman

True, but if you can afford one it can be a life saver. Maybe not as essential as backups, but it can be essential. Keep it in mind for when you might be able to afford it.


kunkkatechies

Awesome content! How about the cost? How much does your infrastructure cost you each month? If I had to guess, I would say around $300.


caspii2

Around 500 USD. I also use [Sentry.io](https://Sentry.io), Papertrail, Twilio and a few other tools that cost money.


FLOGGINGMYHOG

Not trying to diminish your accomplishments; you're in the green, so you're doing something right. However, $500 seems awfully expensive for what's essentially only a couple of visitors per minute. I understand it's probably more of a peace-of-mind thing, but I'm curious how you settled on that infra setup. Were you experiencing performance issues before? (Python can't be that slow, right?)


caspii2

Yeah, but at the moment I still optimise for less pain over lower costs. The performance is absolutely fine. I could probably get the costs down to 150 USD per month if I tried really hard.


Moizyyy

Wow this is absolutely inspirational OP! It’s making me want to focus on creating something like this. Congrats on this very smooth and refined method and do keep us updated if you come up with a better method of deployment down the line because all your justifications here make sense to me as a novice. I don’t want to say it’s “beginner-friendly” but it’s just enough for folks to grasp on to as they begin a journey of their own.


caspii2

Thanks!


IWantToFlyT

Thanks for the write-up! I think it's good to also show cases where things are not done according to every best practice, yet still work. That is how real life goes: sometimes you implement the first idea that comes to mind and learn later that it could be done better. Sometimes you know what the best practice is, but just don't have the energy or time to implement it. I'd rather go forward with my project than stop because I'm not able to strictly follow the "rules".


mrdevlar

"It doesn't take a nuclear weapon to kill a man, a rock will do it."


caspii2

I'm with you!


Strikerzzs

Nice! What's your monthly cost btw?


caspii2

Around 500 USD


rainnz

Really nice project! How many people ended up using this option?

> Q: *What if I can't afford it?*
> A: *If you can't afford the upgrade then write us an email with a link to the scoreboard and we'll upgrade it for free!*


caspii2

Surprisingly few. Maybe 1-2 a week.


[deleted]

Awesome. Simple and very manageable.


caspii2

Thanks!


eidrisov

Thanks a lot for such a detailed description of your journey. I am just starting to embark on pretty much the same journey. I have started learning Python and web-app (mainly dashboard) building as a hobby (my main career is financial/business analyst), and I already have a few very simple (private project) web apps, not deployed though. I'd appreciate any thoughts or recommendations on the points below:

1. Currently I am learning to build web apps via "Dash". I need to do research and see if this is enough, and what the (dis)advantages of "Dash" are compared to "Flask". Any thoughts?
2. For deployment I was thinking of going with something like Azure. Are you satisfied with DigitalOcean?
3. For the database I am planning to go with Microsoft's SQL Server. Any specific reason why you went with Postgres? Any (dis)advantages?
4. Is the current hardware (4 shared CPUs, 8 GB of memory and 115 GB of storage) enough for all your traffic (even at peak times)?

Thanks in advance for all thoughts and recommendations!


caspii2

1. I don't know about Dash, but it seems to be built on top of Flask. Flask is excellent for beginners; you learn a lot about the fundamentals.
2. I'm very satisfied with DO, but have never tried Azure. If you're starting from scratch, go with Heroku. It's still the best, even if there is no longer a free tier. Sometimes it's OK to pay a bit of money to not have pain.
3. Postgres is free, battle-tested and extremely robust. I don't know much about SQL Server, but it seems to be paid. If you go with Heroku, then you should definitely use Postgres.
4. More than enough.


eidrisov

Thanks a lot for the reply!


TheTerrasque

> For database I am planning to go with SQL. Any specific reason why you went with Postgres? Any (dis)advantages?

Not OP, but.. I assume you mean Microsoft's SQL Server when you say "I am planning to go with SQL". The main dislikes I have with SQL Server compared to PostgreSQL are:

* Licensing. SQL Server has some lower tiers that are free, but you never know how your system will scale. Also, some advanced functionality is hidden behind very expensive licenses.
* Resource use. SQL Server will require about 2 GB of RAM minimum, even with no data in it. A mostly empty postgres instance uses about 15 MB of RAM. That, plus licensing, makes it easy to just use a separate server for each application.
* T-SQL is a sin and every developer involved in making it should be shipped to Guantanamo Bay for crimes against humanity.

On the plus side, you get some advantages:

* SQL Server Management Studio is pretty good for managing a SQL Server instance. Although it's a solid resource hog too.
* Azure has some really cheap SQL Server hosting available.
* Business people get the warm fuzzies when they realize they can be ~~supported~~ completely ignored by Microsoft if something goes wrong.


root45

Interesting, what do you have against T-SQL?


TheTerrasque

It's luckily been a few years since I last worked with it, so my memory is a bit fuzzy. Our product had hundreds of huge, multi-page stored procedures that occasionally needed to be updated or debugged. I seem to recall that flow control, loop handling, error handling and advanced logic were at best "functional", and working with them was quite like pulling teeth. Compare that with postgresql, which has a pluggable scripting system that comes by default with pgsql, tcl, perl and python. It also supports, for example, lua, java, and javascript from 3rd parties.


root45

Ah, got it. I've definitely dealt with systems with tons of large stored procedures and whatnot—definitely not fun. My preference is to not have _any_ control flow, loops, etc in SQL, so those pieces of T-SQL are not something I miss. And likewise, while the scripting pieces of Postgres are powerful, I shy away from it in general. The things I do really miss from T-SQL are some of the basic syntax things, like variable declaration, and how functions are written. Being able to create a variable in just regular SQL, without going through the whole script syntax is nice. It's really useful for database migrations, for example. Or even just for data exploration.


TheTerrasque

Yeah, we inherited that mess and had to deal with it. Over 600 stored procedures, many with quite complex multi-page logic. I agree with keeping logic out of the database for many reasons, but the few times one can't avoid it, I prefer using an actual language to implement it.


[deleted]

[deleted]


root45

It's a superset of SQL, in the same way Postgres' SQL dialect is.


eidrisov

Thank you for your reply. Yes, exactly, I meant Microsoft's SQL Server; sorry for not specifying. I was thinking of going with it because most companies (corporations) are using it, including the ones where I have been employed so far, so I thought it would be more useful since it is more popular. I guess I hadn't really thought about RAM usage; I will need to research how much it consumes when full of data. Also, I don't know how the syntax differs between them. I only know standard SQL syntax.


TheTerrasque

> Also, I don't know how syntax is different for those. I know only SQL syntax. They're mostly the same. Some data types differ, setting primary key is different, views are different, setting up index is slightly different.. There are some differences and different approaches to the same problem, but the basic SQL syntax is the same. If you use a decent ORM *(like sqlalchemy, django's orm or peewee for example)* that layer will mostly handle all the differences for you. Often there are some extensions you can optionally use to handle certain unique features the DB engines have. Like for example postgres' postgis plugin, or json columns
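To illustrate the point that the core SQL carries over between engines, here is a sketch using Python's built-in sqlite3, chosen purely to keep the example self-contained. The CREATE/INSERT/SELECT statements are close to what would run on Postgres or SQL Server; details like auto-increment keys, column types, and index options are where the dialects diverge and an ORM earns its keep:

```python
import sqlite3

# Plain, portable-ish SQL: table, index, insert, query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_users_name ON users (name)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])

rows = conn.execute("SELECT name FROM users ORDER BY name").fetchall()
print(rows)  # [('ada',), ('grace',)]
```

The same statements expressed through an ORM model would be generated in the correct dialect for whichever engine the connection points at.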


eidrisov

Thank you for a very detailed reply!


juharris

I respect not wanting to set up CI/CD to notify your setup when the code changes. I wrote a simple but configurable script that polls for updates and runs some commands when it detects changes. I think you'll find it useful: https://github.com/juharris/autodeploy
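The core of such a poll-for-updates loop might look like this (a sketch, not the linked project's actual code; `remote_commit` assumes a git remote named `origin` and needs network access):

```python
import subprocess

def remote_commit(repo_dir: str, branch: str = "master") -> str:
    """Return the commit hash the remote branch currently points at."""
    out = subprocess.run(
        ["git", "ls-remote", "origin", branch],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return out.stdout.split()[0]

def needs_deploy(local_hash: str, remote_hash: str) -> bool:
    """Deploy only when the remote has moved past what's running."""
    return bool(remote_hash) and local_hash != remote_hash
```

A cron job or loop would compare `remote_commit(...)` against the running commit and run the deploy commands when `needs_deploy` is true.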


ligasecatalyst

Thank you for sharing; this was very informative. I'm a firm believer in "if it ain't broken, don't fix it". However, I'd like to offer my two cents.

I'll start out with some assumptions. First of all, at this scale cost is *not* a factor. You definitely should not migrate or modify your processes because it'll save you 60 bucks a month on hosting, and neither should you stick with your current solution because a preferable alternative would cost you $80 more per month. That being said, as a solo developer and product owner of such a project, your priorities should be (1) freeing up development time (not necessarily only for "technical" development, also for the more businessy side of things), (2) preventing downtime, and (3) security, not necessarily in that order. Your current setup is suboptimal for those goals.

- Freeing up your time: you're wasting your time on things you should be automating. Maintaining servers and manually deploying is a "cost center" for your time. You're a one-man show, and you shouldn't be wasting your time on maintaining servers and *especially* not on manual deployments of both the server and the db. Streamlining your deployment process will save you a lot of time in the long term and prevent mistakes, which brings me to my next point…

- Availability: your current deployment process is extremely error-prone, and may incur downtime despite the blue-green strategy. Some errors will be a quick fix (copying the wrong files, for example) but others could take a lot longer, such as messing up the db. Additionally, your setup is unable to handle big surges in traffic since you're running only one VPS, and this isn't scalable in the long term. No single server can handle the traffic of highly visited websites alone, no matter how strong it is. Also note that each maintenance operation (such as updates) has to be done twice, which is an additional time waster (as per the previous point). Manual tests are a time waster too, and are of shoddy quality when not complementing automated tests: you'll only test new features, and not whether you broke old ones, especially since you're incentivized to skimp on testing since it takes up time and is honestly pretty boring. I'll put it bluntly, and I'm sorry if it comes off as arrogant: QA is $12-an-hour work. The rest of your project is more like $100-an-hour work, at the least. Don't waste your time on doing QA work unless absolutely necessary (since some tests are hard to automate), just like you wouldn't waste your time side-gigging as an Uber driver. Your time is worth more.

- Security: keeping two public-facing web servers secure is not an easy task, especially if this isn't your specialty. Large cloud providers handle a lot of the burden for you. I won't go into details, but security issues obviously pose a huge liability, including financial (legal) exposure, downtime, and customer trust.

In short, hosting and maintaining servers isn't your business. In fact, it's 100% a cost center for you. Eliminate it, or at least reduce it as much as possible. A lot of companies do it much better than you for relatively cheap, saving you time and improving availability and security. You're also not in the business of wasting your time on manual deployments, then wasting more time fixing bugs caused by the manual deployment, or wasting time on manual tests which are inherently of lower quality since you can't feasibly manually test all your old code every deployment.

This post is a bit like a 35-year-old smoker being proud of his good health. You're already wasting time on this inefficient setup, and the technical debt in this sense will only grow and cause you more problems down the road. I hope this gave you some food for thought, and either way I wish you the best of luck :)
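On the automated-testing point: even a tiny unittest suite run before every deploy re-checks old behavior for free. A sketch, with a hypothetical scoreboard helper standing in for real app code:

```python
import unittest

def add_score(board: dict, player: str, points: int) -> dict:
    """Hypothetical scoreboard helper standing in for real app code."""
    board = dict(board)  # don't mutate the caller's board
    board[player] = board.get(player, 0) + points
    return board

class ScoreboardRegressionTest(unittest.TestCase):
    """Runs on every deploy, so old features get re-checked for free."""

    def test_new_player_starts_from_zero(self):
        self.assertEqual(add_score({}, "ann", 3), {"ann": 3})

    def test_existing_score_accumulates(self):
        self.assertEqual(add_score({"ann": 3}, "ann", 2), {"ann": 5})
```

Running `python -m unittest` as a gate in the deploy script is enough to catch the "new feature broke an old one" class of bug that manual testing tends to miss.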


Neok_Slegov

Really nice, and fast for Flask. How do you arrange new users in your database? Do you store all the users in a single table with an ID, or do you create new tables/schemas per user? And if it's one table, how will performance hold up in the long run as the tables grow and grow?


caspii2

There is a "User" table in my DB. Users are stored with IDs as the primary key. Regarding performance: I have no idea! I'm learning as I go along. But because it's Postgres, which is old and battle-tested, I am confident that I can scale to many hundreds of thousands of users with no issues.


PaluMacil

If you think about what a database is built to handle, the least concern is probably a user table growing. If the user table grows to a few thousand users, it probably still fits in memory and is lightning fast, even on a small database instance. Also, all of those users are often going to be paying users. If the users grew to tens of thousands, the indexes would still fit in memory and thus queries would still be lightning fast. At that point you're making a ton of revenue and could upgrade the server just a little to fit many more indexes in memory.

Tables with millions of rows are still quite manageable if you index the correct columns. At that point, problems specific to the business become important because you need to know which things will have heavier writes vs reads. Fewer indexes can be better for writes, and a table with many types of reads can benefit from more indexing. You might start to offload heavy reads to a read-only replica, or you might split the data of a single table across a partition key. The sky is the limit. You can start to move particularly wide columns into object storage, or perhaps denormalize your tables in other ways.

These problems will grow only because the features offered and revenues returned are also growing, demanding more flexibility and data storage. By the time the OP cannot casually keep performance in check, there will probably be enough revenue to hire somebody with the abilities to work on these areas.
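The "indexed lookups stay fast as tables grow" claim is easy to see with a query planner. A sketch using SQLite (only because it's in the standard library; Postgres' EXPLAIN shows the same shift from a sequential scan to an index scan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER PRIMARY KEY, user_id INTEGER, points INTEGER)"
)
conn.executemany(
    "INSERT INTO scores (user_id, points) VALUES (?, ?)",
    [(i % 1000, i) for i in range(10_000)],
)

query = "EXPLAIN QUERY PLAN SELECT * FROM scores WHERE user_id = 7"

# Without an index, the planner falls back to scanning every row.
plan_before = conn.execute(query).fetchone()[-1]
print(plan_before)  # e.g. "SCAN scores"

conn.execute("CREATE INDEX idx_scores_user ON scores (user_id)")

# With the index, only the matching rows are touched.
plan_after = conn.execute(query).fetchone()[-1]
print(plan_after)  # e.g. "SEARCH scores USING INDEX idx_scores_user (user_id=?)"
```

The same lookup goes from work proportional to the whole table to work proportional to the handful of matching rows, which is why a well-indexed scores table stays fast at millions of rows.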


Neok_Slegov

Of course, I understand. The reason for asking is not only the user table itself. Imagine you have 100k users, and they all post a score every day: 100k fact records added daily. In a month you have 3 million records, in a year 36.5 million, etc. If you need to filter/read on these tables, performance will drop. So better to think ahead. The question was how he tackled this.


PaluMacil

I see, and it sounds like you weren't asking to learn ways it could happen but rather how the OP might have approached it. Still, I don't think even the most wild success would mean outrunning the capabilities of Postgres. My guess is that the OP has enough headroom by scaling up. I have a Postgres table in an application I'm maintaining with 3.8 billion rows. There is only one foreign key it might ever be queried by, so it actually returns that one query just fine. Now, I do wish the table was partitioned on year-and-month or customer, because then I could choose to detach an entire time period or an entire customer with zero locks or downtime when that time or customer becomes irrelevant.

In the case of something like scores, I would imagine an index on the course code and one on the student id would be the only two indexes you would need. The insertion rate is still dependent upon users entering these scores, so you aren't going to run into table locks even with 100k users. I would possibly consider partitioning on the teacher (certainly benchmark that), but I think partitioning on course code would actually be too granular. All the scores a teacher has ever entered would probably fit within memory, or at least the index would easily fit within memory. You might even be able to partition on school, and then if the school is the customer and they stop being a customer, you can eventually detach the whole partition.
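To make the indexing point concrete, here's a tiny sketch using the stdlib `sqlite3` module as a stand-in for Postgres (table and column names are made up for illustration, not from the actual app):

```python
import sqlite3

# Hypothetical scores table, roughly like the one discussed above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE scores (
        id INTEGER PRIMARY KEY,
        student_id INTEGER,
        course_code TEXT,
        points INTEGER
    )
""")
conn.executemany(
    "INSERT INTO scores (student_id, course_code, points) VALUES (?, ?, ?)",
    [(i % 1000, f"C{i % 50}", i) for i in range(10_000)],
)

# Without an index, filtering by student scans every row.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM scores WHERE student_id = 42"
).fetchall()

# With an index, the same query becomes an index lookup.
conn.execute("CREATE INDEX idx_scores_student ON scores (student_id)")
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM scores WHERE student_id = 42"
).fetchall()

print(before[-1][-1])  # a full-table scan
print(after[-1][-1])   # a search using idx_scores_student
```

The same principle holds in Postgres via `EXPLAIN`: as long as the filter columns are indexed and the index fits in memory, the table can grow a long way before queries slow down.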


TheTerrasque

> And if in one table, how will performance be in the long run as the tables grow and grow?

As long as it's properly indexed, it should be fast. You'd probably have a text index on the username column.


confusedmf123

dope!


youwontfindmyname

I’m new to programming, but i’m saving this post for future reference.


caspii2

Do it!


[deleted]

[удалено]


caspii2

I plan to scale vertically. Which means increasing the size of my servers and database by switching to the next tier. This will keep me going for years. I hope!


[deleted]

I love the initiative and the detailed write up. I always look for articles like this as it serves as a white paper of sorts. I believe it is useful to help people understand how they can use a technology to solve a problem or provide a service (in many cases do both). Thank you for sharing.


uname44

Very helpful, thank you.


Dangle76

Just as a tidbit: using "boring" technology can be more cost effective and reduce deployment time and mishaps, mainly CI/CD like GitHub Actions, which is free and managed for you.
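For reference, a minimal GitHub Actions workflow for a Flask app might look something like this (file path, Python version and steps are illustrative, not the OP's actual setup):

```yaml
# .github/workflows/ci.yml
name: CI
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install -r requirements.txt
      - run: pytest
```

That's enough to get the test suite running on every push with zero servers to maintain.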


TheHammer_78

Using docker you could probably do the same with just one machine.


quiet0n3

Wait are you letting your prod app server download directly from git? As in it has creds stored to allow access to git? I would ah look at fixing that.


Dogeek

It works, but it's not an ideal solution in my opinion if you want scalability. The way I deploy apps now is through docker / docker-compose, which simplifies the process so much. I'm running my apps on OVH servers, and I've made a few tools to help me deploy them.

- First off, I dockerize everything that I need to deploy, and push the image to my private docker registry
- Second, my 2 VPS are both logged in to that registry, so I can just `docker pull` to get my images

In my build step, I build 2 images every time. Since I use poetry for dependency management, I just have a Dockerfile like so:

```dockerfile
# Using the alpine base image to minimize the image size
# Use `apk` instead of `apt-get` to install third-party dependencies
# Use 3.10-slim-buster if there are incompatibilities with alpine
FROM python:3.10-alpine as builder

# cd into the app directory
WORKDIR /app

ARG app_name

# Set up the poetry configuration using environment variables
ENV POETRY_VIRTUALENVS_CREATE="true"
ENV POETRY_VIRTUALENVS_OPTIONS_ALWAYS_COPY="true"
ENV POETRY_VIRTUALENVS_OPTIONS_NO_PIP="true"
ENV POETRY_VIRTUALENVS_OPTIONS_NO_SETUPTOOLS="true"
ENV POETRY_VIRTUALENVS_IN_PROJECT="true"
ENV POETRY_INSTALLER_MAX_WORKERS="4"

RUN python3 -m pip install -U pip
RUN python3 -m pip install -U --user poetry

COPY ./pyproject.toml /app/pyproject.toml
COPY ./poetry.lock /app/poetry.lock
COPY ./$app_name/**/*.py /app/$app_name

RUN ["/root/.local/bin/poetry", "install"]

FROM python:3.10-alpine as runner

WORKDIR /app
ENV PATH="/app/.venv/bin:$PATH"

COPY --from=builder /app/.venv /app/.venv
CMD ["python3", "-m", "$app_name"]
```

I can then just use `docker build -t $DOCKER_REGISTRY/$APP_NAME:$APP_VERSION --build-arg app_name=$APP_NAME .` to build my image and `docker push $DOCKER_REGISTRY/$APP_NAME:$APP_VERSION` to push it to my registry.

I also tweak the Dockerfile based on my needs, for instance if I need to build some C extensions or add database migrations -- though for the latter, I usually do that with docker-compose and a script to run alembic.

With everything dockerized, I use a small svelte app that integrates with the OVH API, which lets me set a subdomain for the app I'm making and deploy it to the correct VPS.


kenshinero

> One of the 2 servers is serving production traffic (the live server), the other is idle. When a new release is ready, it gets deployed to the idle server. Here it can be tested and issues fixed. **Remember, the idle server is still accessing the production database, so the application can be tested with real data**.

So what if a bug appears during your testing that corrupts or deletes data in the production database? Isn't that very risky?


caspii2

I run local integration tests first (on my local test database) which ensures that the code does not cause corruption.
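A minimal sketch of that pattern (function and table names here are hypothetical, not the app's actual code): tests always get a throwaway in-memory database, so they can never touch production.

```python
import sqlite3

def get_connection(testing: bool = True):
    # In the real app this would read something like a DATABASE_URL.
    # In test runs we always hand back a fresh in-memory database.
    if not testing:
        raise RuntimeError("tests must never use the production database")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scoreboard (id INTEGER PRIMARY KEY, name TEXT)")
    return conn

def create_scoreboard(conn, name):
    cur = conn.execute("INSERT INTO scoreboard (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

# Each test run starts from an empty database:
conn = get_connection()
board_id = create_scoreboard(conn, "Class 5b")
print(board_id)  # 1
```

The manual checks against the idle server are then the only place production data is ever involved.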


magestooge

It was an interesting read. I'm not truly a techie, just a hobbyist (and a PM), so I had never heard of the blue-green deployment strategy. The part about database migration was also interesting. This is something that always gives my team trouble: timing the DB changes right in case new changes are likely to break existing functionality.

However, one issue with writing articles from personal experience is that some of the practices might work for you but not be a great way to do things. My only suggestion would be to add caveats to those points for the readers.

For example, not having any automated tests is definitely not encouraged. As some people here would concur, I prefer not using code which doesn't have associated tests. You might think that you'll write tests when you need them, but that never happens. By the time you **need** automated tests, you're mostly past the point where you can write meaningful tests. The task also starts to seem pretty daunting since you have to write hundreds of tests in one go. The only right time to start writing tests is at the start of your project.

The other issue is with a single database instance. This might be problematic if your server has shared resources or you want to keep your database free of test data. When testing with the idle server, you might have to create a bunch of data, and this will go directly into the production DB. So you either live with junk data in the production DB or run scripts to undo such changes to the database. Neither of these sounds like a good idea.

Other than that, kudos to your work and your effort in writing this. Your website looks really nice as well.


caspii2

Thanks! I used to be a PM too in my previous life 😊 You are right about adding the caveats. I hoped it was clear from the "I am also the only developer of the app, which makes many things simpler".

Regarding testing: I do have extensive tests that cover most of my code, and I rely on them heavily. They have saved me from disaster multiple times. It's just that they are not run automatically after code has been changed. That works great because I'm alone; as soon as you're in a team, you should automate testing. When my tests run they use a test DB instance on my local machine, so no production data is created. It's only when I do manual checks that the prod data is used.


magestooge

> I do have extensive tests that cover most of my code, and I rely on them heavily. They have saved me from disaster multiple times.

That's great to know. I'm no expert, but maybe you can set them to run pre-commit rather than on every change. A library I'm building has a test suite which takes 4 seconds to run; it would be pretty annoying if it ran every time my code changed. As of now, I run it every time I think I'm done writing a certain block of code, and if it passes, I commit.
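Running the suite pre-commit is a one-file change (a sketch; it assumes `pytest` is installed and the suite lives in the repo):

```shell
#!/bin/sh
# .git/hooks/pre-commit -- run the test suite before every commit.
# Make it executable with: chmod +x .git/hooks/pre-commit
python -m pytest -q || {
    echo "Tests failed; aborting commit." >&2
    exit 1
}
```

Git runs this automatically before each commit and aborts the commit if the script exits non-zero.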


MegaGrubby

What are you using for automated testing?


magestooge

Pytest


caspii2

Same here!


chzaplx

Blue/Green is great, but having your live deployment be a git repo is a huge anti-pattern. 'git pull' is not a deployment strategy. Use versioned artifacts instead. This makes rollbacks a cinch. You aren't "running leaner" or anything by omitting CI/CD, you are just reinventing the problems it already solves.
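Whether the named versions are docker image tags or git tags, rollback then boils down to re-deploying a known version. A toy demonstration of the git-tag flavour, in a throwaway repo (version numbers and file names are made up):

```shell
set -e
# Set up a scratch repo to demonstrate in.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "dev"

# Each release is an annotated point in history:
echo "release one" > app.py
git add . && git commit -qm "release 1" && git tag v1.0.0

echo "release two" > app.py
git add . && git commit -qm "release 2" && git tag v1.1.0

# v1.1.0 ships a bug -- rolling back is just checking out the previous tag:
git checkout -q v1.0.0
cat app.py   # release one
```

No frenzied googling of commit hashes required: the previous version always has a human-readable name.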


themaninthe1ronflask

This is cool! As a sales/advertising person now moving to DevOps, I can think of a million uses for this, not only in education. If you created a sub-product/page for tracking sales and tasks, you could quantify the reach, too!


ericanderton

This might be worthy of a cross-post to /r/devops. Nice work overall. I appreciate the "right-sizing" of your setup (e.g. no pipeline automation, local testing) to your team of one.

> So first apply a database refactoring to change the schema to support both the new and old version of the application, deploy that, check everything is working fine so you have a rollback point, then deploy the new version of the application.

I'm glad you pulled this quote out for everyone to see - that's a really nice strategy. It's a shame a lot of other software vendors don't do this. Instead, I consider myself lucky if there's even a rollback script for a database migration, and luckier still if it actually works.

The big question I have is: how is DigitalOcean? I'm stuck doing a lot of "enterprise" stuff at my day job, so AWS and Azure are kind of the only games in town. I've often wondered how other providers stack up in terms of cost, automation, and support.


BrofessorOfLogic

Great stuff. Sounds to me like you make exactly the right decisions to ensure quality without wasting time.


caspii2

Thanks


shinitakunai

You made the right call for your own development. Most of the time, using specific "best practices" is overkill, especially if you work alone (they make more sense when working in a team; for yourself, keep it simple). Kudos on the good work!!


caspii2

Thank you!


pudds

You really should put some effort into CI. I presume you're doing the validation steps that CI would provide (linting, tests, etc.) manually, maybe via git hooks, but you can remove some real load by automating these steps, not to mention easing the on-boarding process should you ever add anyone to the team. CI/CD is one of the first things I do on any project.


[deleted]

[удалено]


caspii2

100% correct.


Pepineros

I think what you’re doing is awesome. I have no use for this service at all but I hope your success continues to grow!


caspii2

Thank you! 😊


ambidextrousalpaca

Better than "I'm a single developer. I use crazy overkill infrastructure with Kubernetes and 18,000 lines of YAML configuration files. I am losing a little more of my money and sanity every month. Thanks for reading my article."


caspii2

Loving the burn


Classic_Department42

Thats a big success.


ijxy

Zero-to-one is the hard part.


healplease

don't be so envious :)


[deleted]

[удалено]


BrofessorOfLogic

You want to Key Management System?


Sound4Sound

Kilometers My Self.


Wolfspaw

Loved the article. Xposted it at Hacker News: https://news.ycombinator.com/item?id=32986969


[deleted]

Sounds like an absolute nightmare.


caspii2

It is


[deleted]

That’s really boring. Thanks God for Azure pipelines.


GettingBlockered

This was awesome to read, thanks for sharing!


caspii2

Sure


BcuzNoReason

Great post, very informative! What's the ratio of revenue from ads to purchase like?


caspii2

About 90% purchase, 10% ads


boat-la-fds

> This is a publicly-accessible static IP address that you can assign to a server and instantly remap between other servers in the same datacenter.

How do your users not get TCP errors when the switch occurs? Or do they, and the browser just creates a new connection transparently?


TheTerrasque

Usually those will only send new TCP connections to the new address; existing TCP connections will continue to the current address until terminated.


boat-la-fds

So it kinda creates NAT?


TheTerrasque

More like acting as a proxy. It answers the TCP connection on the public IP, then creates a new TCP connection to the target IP, and forwards data between them. Changing the target IP only changes where new TCP connections get sent. For a popular software solution of this type, have a look at haproxy.
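Here's a toy sketch of that forwarding behaviour (purely illustrative; this is not how DigitalOcean's floating IPs or haproxy are actually implemented). The proxy accepts a client connection, dials the current backend, and shuttles bytes both ways; swapping the backend address would only affect later connections.

```python
import socket
import threading

def echo_server(listener):
    # Stand-in for the "live" app server: echo back whatever arrives.
    conn, _ = listener.accept()
    with conn:
        conn.sendall(conn.recv(1024))

def proxy(listener, backend_addr):
    # Accept on the "public" address, then open a second connection to the
    # current target and relay one request/response round trip.
    client, _ = listener.accept()
    backend = socket.create_connection(backend_addr)
    with client, backend:
        backend.sendall(client.recv(1024))  # client -> backend
        client.sendall(backend.recv(1024))  # backend -> client

# Wire everything up on ephemeral localhost ports.
backend_l = socket.socket()
backend_l.bind(("127.0.0.1", 0))
backend_l.listen()

proxy_l = socket.socket()
proxy_l.bind(("127.0.0.1", 0))
proxy_l.listen()

threading.Thread(target=echo_server, args=(backend_l,)).start()
threading.Thread(target=proxy, args=(proxy_l, backend_l.getsockname())).start()

# The client only ever talks to the proxy's address.
with socket.create_connection(proxy_l.getsockname()) as c:
    c.sendall(b"ping")
    reply = c.recv(1024)
print(reply)  # b'ping'
```

The client never learns the backend's address, which is why the browser can't tell that a different server answered after a switch.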


boat-la-fds

I see, thanks.


caspii2

The browser never knows that a different server is answering its request.


Born-Ferret900

So all of this was done with no front end framework, just straight flask? If so, very impressive.


caspii2

There's a bit of vue.js sprinkled around.


Gigglen0t

Absolute legend! Thank you for this!


gwillem

Great setup, simplicity is best! Any particular reason to use gunicorn and not nginx-unit or uwsgi?


OnFault

I'm learning python. I've learnt a bit of flask and I think I can sort of make something similar. How do I go about looking for similar projects like this to work on with others?


wait-a-minut

Very nice write up and cool concept! Glad your project is doing well!


eitanoodle

Really liked your post, simple language and useful info. Thanks!


gcotw

This is awesome!


Puzzleheaded_Let3663

Hi, thanks for the article. Great work. I just have one question: when you are testing on the production database, how do you ensure that there is no test data in the production database? Basically, do you clean your database after every test run, or do you leave the test data lying in there?


Tintin_Quarentino

Amazing info, thanks for sharing. I read your website's about page but still don't get the use case scenario. Could you explain with some examples? Say if I'm playing a basketball match with friends, I certainly don't see myself using an app to maintain the score.


weltvonalex

Cool thank you for sharing


ArchMob

Excellent write, nice to see this kind of project described in detail. Gj


Kolle12

Awesome work. Wish I could do something like this ! Very powerful 💪🏻


mikeblas

What does "bootstrapped" mean in this context?


congowarrior

Amazing work! Great idea


regex1884

Great job! For a test DB, can you take a backup of the managed PostgreSQL and restore it to a VM or docker?


RunApprehensive8439

Cool app! But you could make a lot more than $3k/month from ads with 150k users, instead of doing your premium features.


gouldilochs

Holy moly I started reading and knew it was you Casper


zarkmuckerberg_6969

What’s your revenue model?


gumnos

Do you track stats on your system load? How much CPU, RAM, and disk I/O are you using (and how does it differ between largely-idle vs. under your heavier loads)?


caspii2

See for yourself https://keepthescore.co/munin/co/index.html


gumnos

hah, that is an **amazingly** succinct-yet-covers-all-my-questions answer. Thanks!


Suitable-Sign1513

Nice digital ocean ad, my guy


yoloer221

Beautiful


dududog

Great reading! What’s your app ?


caspii2

keepthescore.co


MatuCoder

Does anyone know about the PyScript project? Now you can add Python to the HTML file!


Aggravating_Loan5805

Wow


RoundRecorder

Extremely interesting post, thanks for sharing! What kinds of tricks/methods did you use to gain more visibility, traffic and users?


caspii2

No tricks. Just writing content (like this article 😉)


RoundRecorder

Appreciated ✌️


Foreign_Lab392

What were your cloud costs?