2026-01-31 22:22:22
I am still learning my way around the command line. Many times I know what I want to do, but not how to express it as a shell command.
This challenge gave me a chance to explore GitHub Copilot CLI as a bridge between natural language and terminal commands.
I created a small helper workflow where I described what I wanted in plain English and let Copilot CLI suggest the matching shell command.
Example:
Task: count the number of folders
Copilot suggestion:
```bash
find . -maxdepth 1 -type d | wc -l
```
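One caveat worth knowing: this command also counts the current directory itself (the `.` entry), so the result is one higher than the number of subfolders. A small tweak such as `find . -mindepth 1 -maxdepth 1 -type d | wc -l` should exclude it.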
What I learned
Copilot CLI is best used interactively, with a human in the loop
It is very helpful for beginners who understand goals but not syntax
Understanding CLI limitations is as important as using AI tools
This experience helped me become more confident with terminal commands instead of blindly copy-pasting them.
This post is my submission for the GitHub Copilot CLI Challenge.
2026-01-31 22:21:21
The software industry moves fast.
Frameworks change.
Trends rotate.
Roadmaps expand faster than teams can think.
But speed is not always progress.
Over time, I’ve noticed something counterintuitive:
the most durable products are rarely the fastest to ship.
They are the ones that slow down early.
Slower software doesn’t mean less ambition.
It means more intention.
It means asking:
• Why does this exist?
• What problem truly matters here?
• What can we deliberately leave out?
When everything is optimized for velocity, clarity is usually the first casualty.
Features accumulate.
Interfaces grow louder.
Decisions get rushed.
Slowness creates space.
Space to think.
Space to remove instead of add.
Space to design systems that age well.
This mindset is what I’m exploring through AVESIRA:
not as a product launch,
but as a way of thinking about software, design, and presence.
There’s no rush to scale.
No pressure to impress.
Just a commitment to build things that remain understandable over time.
Sometimes the most productive thing a team can do
is pause.
Slow down.
And choose deliberately.
If this resonates, you’re already moving fast enough.
2026-01-31 22:21:05
Originally published on LeetCopilot Blog
LeetCode offers 4,000+ practice problems. Educative teaches patterns and system design. Here's how to choose between practice volume and structured learning.
LeetCode and Educative take completely different approaches to coding interview prep.
LeetCode is a massive problem bank with 4,000+ coding challenges. It's where you grind, build speed, and practice solving problems under pressure.
Educative is a structured learning platform with courses like "Grokking the Coding Interview." It teaches you why solutions work through patterns and concepts.
Which is better? The answer: use both in the right order. But if you can only choose one, here's how to decide.
| Feature | LeetCode | Educative |
|---|---|---|
| Purpose | Practice problems | Learn patterns/concepts |
| Content | 4,000+ problems | 600+ courses |
| Approach | Hands-on grinding | Structured learning |
| System Design | Limited | Excellent (Grokking) |
| Format | Code editor + problems | Interactive text + coding |
| Free Tier | Extensive | Limited |
| Pricing | ~$35/mo or $159/yr | ~$59/mo or $299/yr |
| Best For | Practice + speed | Learning + concepts |
LeetCode is the industry-standard platform for coding interview practice, used by millions of engineers preparing for tech interviews.
Educative is a structured learning platform with interactive, text-based courses covering programming, system design, and interview prep.
| | LeetCode | Educative |
|---|---|---|
| Core Purpose | Practice problems | Learn concepts |
| Philosophy | "Solve problems to learn" | "Learn patterns to solve problems" |
| Outcome | Speed + pattern recognition via repetition | Deep understanding + transferable skills |
LeetCode's approach: Throw yourself into problems. Learn by doing. Build muscle memory.
Educative's approach: Learn the underlying patterns first. Then apply them systematically.
"I did 200 LeetCode problems but still failed interviews. Educative's Grokking course finally made DP click." — Reddit user
| | LeetCode | Educative |
|---|---|---|
| DSA Problems | 4,000+ | 100s (in courses) |
| System Design | Limited | Excellent (Grokking SD) |
| Behavioral | None | Yes |
| Language Learning | No | Yes (Python, Java, etc.) |
LeetCode wins for raw DSA practice volume.
Educative wins for comprehensive interview prep including system design, behavioral, and concepts.
| | LeetCode | Educative |
|---|---|---|
| Format | Problems + community solutions | Interactive text courses |
| Guidance | Minimal (self-directed) | High (structured paths) |
| Explanations | Community-driven (variable quality) | Professional (consistent) |
LeetCode is better for self-directed learners who want to dive into problems immediately.
Educative is better for those who need structured guidance and thorough explanations.
| | LeetCode | Educative |
|---|---|---|
| System Design Content | Limited articles | Comprehensive courses |
| Quality | Basic | Industry-leading |
| Courses | None | Grokking System Design |
Educative dominates for system design. If you're interviewing for senior roles, Educative's system design courses are essential.
| Plan | LeetCode | Educative |
|---|---|---|
| Free | Most problems | Limited samples |
| Monthly | ~$35/mo | ~$59/mo |
| Annual | ~$159/year | ~$299/year |
| Lifetime | N/A | Sometimes available |
LeetCode is cheaper overall. But Educative covers more ground (system design, courses).
Choose LeetCode if you:
Recommended Path:
Choose Educative if you:
Recommended Path:
The best approach combines both platforms:
Educative First (2-4 weeks)
LeetCode Second (Ongoing)
System Design (If Senior)
| Feature | LeetCode | Educative |
|---|---|---|
| Purpose | Practice | Learn |
| Problems | 4,000+ | 100s (in courses) |
| System Design | Limited | Excellent |
| Structure | None | High |
| Pricing | $159/yr | $299/yr |
| Best For | Practice volume | Conceptual learning |
Should I use LeetCode or Educative first?
Start with Educative to learn patterns, then practice on LeetCode.
Is Educative worth $299/year?
For learning patterns and system design, yes. For practice only, no—use LeetCode.
Can I pass FAANG interviews with just LeetCode?
Yes, but you risk memorizing rather than understanding. Combine with pattern learning.
Can I pass FAANG interviews with just Educative?
Educative teaches well but has less practice volume. You'll likely need LeetCode too.
Which is better for system design?
Educative, by far. Its Grokking System Design is the industry standard.
Good luck with your prep!
If you're looking for an AI assistant to help you master LeetCode patterns and prepare for coding interviews, check out LeetCopilot.
2026-01-31 22:20:35
Original Japanese article: S3トリガー×AWS Lambda×Glue Python Shellの起動パターン整理 (Organizing invocation patterns for S3 trigger × AWS Lambda × Glue Python Shell)
I'm Aki, an AWS Community Builder (@jitepengin).
In my previous articles, I introduced lightweight ETL using Glue Python Shell.
In this article, I’ll organize two patterns for triggering Glue Python Shell when a file is placed in S3, and explain the reasoning behind calling Glue via Lambda.
While it is possible to implement ETL processing entirely within Lambda triggered by S3 events, there are limitations in runtime and memory.
By using Lambda as a trigger and delegating lightweight preprocessing or integration with other services to Lambda while executing the main ETL in Glue Python Shell, you can achieve flexible service integration and long-running processing.
For more details on when to use Lambda vs Glue Python Shell, check out my previous article:
AWS Lambda and AWS Glue Python Shell in the Context of Lightweight ETL
The two patterns covered in this article are:
1. start_job_run (Direct Job Execution)
2. start_workflow_run (Workflow Execution)
Other patterns exist, such as S3 → EventBridge → Step Functions → Glue Python Shell, but we’ll focus on these two simpler approaches.
start_job_run (Direct Job Execution)
In this pattern, Lambda receives the S3 file placement event and directly triggers a Glue Python Shell job using start_job_run.
This was the setup used in my previous article.
Characteristics:
Set the target Job name and parameters for start_job_run. Here we pass the S3 file path.
```python
import boto3


def lambda_handler(event, context):
    glue = boto3.client("glue")

    s3_bucket = event['Records'][0]['s3']['bucket']['name']
    s3_object_key = event['Records'][0]['s3']['object']['key']
    s3_input_path = f"s3://{s3_bucket}/{s3_object_key}"

    response = glue.start_job_run(
        JobName="YOUR_TARGET_JOB_NAME",
        Arguments={
            "--s3_input": s3_input_path
        }
    )

    print(f"Glue Job started: {response['JobRunId']}")
    return response
```
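As a side note, S3 can deliver more than one record in a single event notification, while the handler above only looks at Records[0]. A minimal sketch of a variant that starts one job run per object, reusing the same placeholder job name, might look like this:

```python
import boto3


def lambda_handler(event, context):
    glue = boto3.client("glue")
    run_ids = []

    # Iterate over every S3 record instead of only the first one.
    for record in event.get('Records', []):
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']

        response = glue.start_job_run(
            JobName="YOUR_TARGET_JOB_NAME",  # same placeholder as above
            Arguments={"--s3_input": f"s3://{bucket}/{key}"}
        )
        run_ids.append(response['JobRunId'])

    print(f"Glue Job runs started: {run_ids}")
    return {"JobRunIds": run_ids}
```

The Glue Python Shell job that receives the --s3_input argument looks like this: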
```python
import boto3
import sys
import os

from awsglue.utils import getResolvedOptions


def get_job_parameters():
    try:
        required_args = ['s3_input']
        args = getResolvedOptions(sys.argv, required_args)
        s3_file_path = args['s3_input']
        print(f"s3_input: {s3_file_path}")
        return s3_file_path
    except Exception as e:
        print(f"parameters error: {e}")
        raise


def _to_pyarrow_table(result):
    """
    Compatibility helper to extract a pyarrow.Table from a chDB query_result.
    """
    import chdb

    if hasattr(chdb, "to_arrowTable"):
        return chdb.to_arrowTable(result)
    if hasattr(result, "to_pyarrow"):
        return result.to_pyarrow()
    if hasattr(result, "to_arrow"):
        return result.to_arrow()
    raise RuntimeError(
        "Cannot convert chdb query_result to pyarrow.Table. "
        f"Available attributes: {sorted(dir(result))[:200]}"
    )


def normalize_arrow_for_iceberg(table):
    """
    Normalize Arrow schema for Iceberg compatibility.
    - timestamptz -> timestamp
    """
    import pyarrow as pa

    new_fields = []
    new_columns = []
    for field, column in zip(table.schema, table.columns):
        # timestamp with timezone -> timestamp
        if pa.types.is_timestamp(field.type) and field.type.tz is not None:
            new_type = pa.timestamp(field.type.unit)
            new_fields.append(pa.field(field.name, new_type, field.nullable))
            new_columns.append(column.cast(new_type))
        else:
            new_fields.append(field)
            new_columns.append(column)

    new_schema = pa.schema(new_fields)
    return pa.Table.from_arrays(new_columns, schema=new_schema)


def read_parquet_with_chdb(s3_input):
    """
    Read Parquet file from S3 using chDB.
    """
    import chdb

    if s3_input.startswith("s3://"):
        bucket, key = s3_input.replace("s3://", "").split("/", 1)
        s3_url = f"https://{bucket}.s3.ap-northeast-1.amazonaws.com/{key}"
    else:
        s3_url = s3_input

    print(f"Reading data from S3: {s3_url}")

    query = f"""
        SELECT *
        FROM s3('{s3_url}', 'Parquet')
        WHERE VendorID = 1
    """
    result = chdb.query(query, "Arrow")
    arrow_table = _to_pyarrow_table(result)

    print("Original schema:")
    print(arrow_table.schema)

    # Normalize schema for Iceberg compatibility
    arrow_table = normalize_arrow_for_iceberg(arrow_table)

    print("Normalized schema:")
    print(arrow_table.schema)
    print(f"Rows: {arrow_table.num_rows:,}")

    return arrow_table


def write_iceberg_table(arrow_table):
    """
    Write Arrow table to Iceberg table using PyIceberg.
    """
    try:
        print("Writing started...")
        from pyiceberg.catalog import load_catalog

        catalog_config = {
            "type": "glue",
            "warehouse": "s3://your-bucket/your-warehouse/",  # Adjust to your environment.
            "region": "ap-northeast-1",
        }
        catalog = load_catalog("glue_catalog", **catalog_config)

        table_identifier = "icebergdb.yellow_tripdata"
        table = catalog.load_table(table_identifier)

        print(f"Target data to write: {arrow_table.num_rows:,} rows")
        table.append(arrow_table)
        return True
    except Exception as e:
        print(f"Writing error: {e}")
        import traceback
        traceback.print_exc()
        return False


def main():
    try:
        import chdb
        import pyiceberg

        # Read input parameter
        s3_input = get_job_parameters()

        # Read data with chDB
        arrow_tbl = read_parquet_with_chdb(s3_input)
        print(f"Data read success: {arrow_tbl.num_rows:,} rows")

        # Write to Iceberg table
        if write_iceberg_table(arrow_tbl):
            print("\nWriting fully successful!")
        else:
            print("Writing failed")
    except Exception as e:
        print(f"Main error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    main()
```
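If you want to confirm the append actually landed, a small read-back check with PyIceberg can help. This is only a sanity-check sketch, not part of the job; it reuses the same placeholder warehouse, region, and table name as above, and the limit argument assumes a reasonably recent PyIceberg release:

```python
from pyiceberg.catalog import load_catalog

# Same placeholder catalog settings as in write_iceberg_table(); adjust to your environment.
catalog = load_catalog(
    "glue_catalog",
    **{
        "type": "glue",
        "warehouse": "s3://your-bucket/your-warehouse/",
        "region": "ap-northeast-1",
    },
)

table = catalog.load_table("icebergdb.yellow_tripdata")

# Read back a handful of rows to verify the new snapshot is queryable.
sample = table.scan(limit=10).to_arrow()
print(sample.num_rows)
print(sample.schema)
```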
start_workflow_run (Workflow Execution)
In this pattern, Lambda receives the S3 file event and triggers a Glue Workflow using start_workflow_run.
The Workflow then runs the Glue Python Shell jobs.
Characteristics:
```python
import boto3


def lambda_handler(event, context):
    glue = boto3.client("glue")

    s3_bucket = event['Records'][0]['s3']['bucket']['name']
    s3_object_key = event['Records'][0]['s3']['object']['key']
    s3_input_path = f"s3://{s3_bucket}/{s3_object_key}"

    response = glue.start_workflow_run(
        Name="YOUR_TARGET_WORKFLOW_NAME",
        # The key must match what the jobs read from RunProperties
        # (no "--" prefix is needed for workflow run properties).
        RunProperties={'s3_input': s3_input_path}
    )

    print(f"Glue Workflow started: {response['RunId']}")
    return response
```
1st Job
```python
import boto3
import sys
import os

from awsglue.utils import getResolvedOptions


def get_job_parameters():
    args = getResolvedOptions(sys.argv, ['WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])

    glue = boto3.client('glue')
    resp = glue.get_workflow_run_properties(
        Name=args['WORKFLOW_NAME'],
        RunId=args['WORKFLOW_RUN_ID']
    )

    s3_input = resp['RunProperties'].get('s3_input')
    if not s3_input:
        raise ValueError("s3_input Not Found")

    print(f"s3_input: {s3_input}")
    return s3_input


def _to_pyarrow_table(result):
    """
    Compatibility helper to extract a pyarrow.Table from a chDB query_result.
    """
    import chdb

    if hasattr(chdb, "to_arrowTable"):
        return chdb.to_arrowTable(result)
    if hasattr(result, "to_pyarrow"):
        return result.to_pyarrow()
    if hasattr(result, "to_arrow"):
        return result.to_arrow()
    raise RuntimeError(
        "Cannot convert chdb query_result to pyarrow.Table. "
        f"Available attributes: {sorted(dir(result))[:200]}"
    )


def normalize_arrow_for_iceberg(table):
    """
    Normalize Arrow schema for Iceberg compatibility.
    - timestamptz -> timestamp
    - binary -> string
    """
    import pyarrow as pa

    new_fields = []
    new_columns = []
    for field, column in zip(table.schema, table.columns):
        # timestamp with timezone -> timestamp
        if pa.types.is_timestamp(field.type) and field.type.tz is not None:
            new_type = pa.timestamp(field.type.unit)
            new_fields.append(pa.field(field.name, new_type, field.nullable))
            new_columns.append(column.cast(new_type))
        else:
            new_fields.append(field)
            new_columns.append(column)

    new_schema = pa.schema(new_fields)
    return pa.Table.from_arrays(new_columns, schema=new_schema)


def read_parquet_with_chdb(s3_input):
    """
    Read Parquet file from S3 using chDB.
    """
    import chdb

    if s3_input.startswith("s3://"):
        bucket, key = s3_input.replace("s3://", "").split("/", 1)
        s3_url = f"https://{bucket}.s3.ap-northeast-1.amazonaws.com/{key}"
    else:
        s3_url = s3_input

    print(f"Reading data from S3: {s3_url}")

    query = f"""
        SELECT *
        FROM s3('{s3_url}', 'Parquet')
        WHERE VendorID = 1
    """
    result = chdb.query(query, "Arrow")
    arrow_table = _to_pyarrow_table(result)

    print("Original schema:")
    print(arrow_table.schema)

    # Normalize schema for Iceberg compatibility
    arrow_table = normalize_arrow_for_iceberg(arrow_table)

    print("Normalized schema:")
    print(arrow_table.schema)
    print(f"Rows: {arrow_table.num_rows:,}")

    return arrow_table


def write_iceberg_table(arrow_table):
    """
    Write Arrow table to Iceberg table using PyIceberg.
    """
    try:
        print("Writing started...")
        from pyiceberg.catalog import load_catalog

        catalog_config = {
            "type": "glue",
            "warehouse": "s3://your-bucket/your-warehouse/",  # Adjust to your environment.
            "region": "ap-northeast-1",
        }
        catalog = load_catalog("glue_catalog", **catalog_config)

        table_identifier = "icebergdb.yellow_tripdata"
        table = catalog.load_table(table_identifier)

        print(f"Target data to write: {arrow_table.num_rows:,} rows")
        table.append(arrow_table)
        return True
    except Exception as e:
        print(f"Writing error: {e}")
        import traceback
        traceback.print_exc()
        return False


def main():
    try:
        import chdb
        import pyiceberg

        # Read input parameter
        s3_input = get_job_parameters()

        # Read data with chDB
        arrow_tbl = read_parquet_with_chdb(s3_input)
        print(f"Data read success: {arrow_tbl.num_rows:,} rows")

        # Write to Iceberg table
        if write_iceberg_table(arrow_tbl):
            print("\nWriting fully successful!")
        else:
            print("Writing failed")
    except Exception as e:
        print(f"Main error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    main()
```
2nd Job (identical to the 1st except that its query filters on VendorID = 2)
```python
import boto3
import sys
import os

from awsglue.utils import getResolvedOptions


def get_job_parameters():
    args = getResolvedOptions(sys.argv, ['WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])

    glue = boto3.client('glue')
    resp = glue.get_workflow_run_properties(
        Name=args['WORKFLOW_NAME'],
        RunId=args['WORKFLOW_RUN_ID']
    )

    s3_input = resp['RunProperties'].get('s3_input')
    if not s3_input:
        raise ValueError("s3_input Not Found")

    print(f"s3_input: {s3_input}")
    return s3_input


def _to_pyarrow_table(result):
    """
    Compatibility helper to extract a pyarrow.Table from a chDB query_result.
    """
    import chdb

    if hasattr(chdb, "to_arrowTable"):
        return chdb.to_arrowTable(result)
    if hasattr(result, "to_pyarrow"):
        return result.to_pyarrow()
    if hasattr(result, "to_arrow"):
        return result.to_arrow()
    raise RuntimeError(
        "Cannot convert chdb query_result to pyarrow.Table. "
        f"Available attributes: {sorted(dir(result))[:200]}"
    )


def normalize_arrow_for_iceberg(table):
    """
    Normalize Arrow schema for Iceberg compatibility.
    - timestamptz -> timestamp
    - binary -> string
    """
    import pyarrow as pa

    new_fields = []
    new_columns = []
    for field, column in zip(table.schema, table.columns):
        # timestamp with timezone -> timestamp
        if pa.types.is_timestamp(field.type) and field.type.tz is not None:
            new_type = pa.timestamp(field.type.unit)
            new_fields.append(pa.field(field.name, new_type, field.nullable))
            new_columns.append(column.cast(new_type))
        else:
            new_fields.append(field)
            new_columns.append(column)

    new_schema = pa.schema(new_fields)
    return pa.Table.from_arrays(new_columns, schema=new_schema)


def read_parquet_with_chdb(s3_input):
    """
    Read Parquet file from S3 using chDB.
    """
    import chdb

    if s3_input.startswith("s3://"):
        bucket, key = s3_input.replace("s3://", "").split("/", 1)
        s3_url = f"https://{bucket}.s3.ap-northeast-1.amazonaws.com/{key}"
    else:
        s3_url = s3_input

    print(f"Reading data from S3: {s3_url}")

    query = f"""
        SELECT *
        FROM s3('{s3_url}', 'Parquet')
        WHERE VendorID = 2
    """
    result = chdb.query(query, "Arrow")
    arrow_table = _to_pyarrow_table(result)

    print("Original schema:")
    print(arrow_table.schema)

    # Normalize schema for Iceberg compatibility
    arrow_table = normalize_arrow_for_iceberg(arrow_table)

    print("Normalized schema:")
    print(arrow_table.schema)
    print(f"Rows: {arrow_table.num_rows:,}")

    return arrow_table


def write_iceberg_table(arrow_table):
    """
    Write Arrow table to Iceberg table using PyIceberg.
    """
    try:
        print("Writing started...")
        from pyiceberg.catalog import load_catalog

        catalog_config = {
            "type": "glue",
            "warehouse": "s3://your-bucket/your-warehouse/",  # Adjust to your environment.
            "region": "ap-northeast-1",
        }
        catalog = load_catalog("glue_catalog", **catalog_config)

        table_identifier = "icebergdb.yellow_tripdata"
        table = catalog.load_table(table_identifier)

        print(f"Target data to write: {arrow_table.num_rows:,} rows")
        table.append(arrow_table)
        return True
    except Exception as e:
        print(f"Writing error: {e}")
        import traceback
        traceback.print_exc()
        return False


def main():
    try:
        import chdb
        import pyiceberg

        # Read input parameter
        s3_input = get_job_parameters()

        # Read data with chDB
        arrow_tbl = read_parquet_with_chdb(s3_input)
        print(f"Data read success: {arrow_tbl.num_rows:,} rows")

        # Write to Iceberg table
        if write_iceberg_table(arrow_tbl):
            print("\nWriting fully successful!")
        else:
            print("Writing failed")
    except Exception as e:
        print(f"Main error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    main()
```
Example:
S3 → Lambda → Glue Workflow
├── Job1: Data ingestion & preprocessing
├── Job2: Data transformation
└── Job3: Validation & output
Example:
S3 → Lambda → Glue Job (single)
We introduced two patterns for triggering Glue Python Shell via S3 events.
When using Lambda to trigger Glue, start_job_run is the simpler choice when a single job is enough, while start_workflow_run fits when several jobs need to run as an orchestrated workflow.
Glue Python Shell may seem like a niche service compared to Glue Spark, EMR, or Lambda, but it can be cost-effective, long-running, and Spark-independent.
Combining it with chDB or DuckDB can boost efficiency, and PyIceberg makes Iceberg integration straightforward.
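As an illustration, here is a minimal, untested sketch of what the read step from earlier could look like with DuckDB instead of chDB. It assumes duckdb is bundled with the Glue Python Shell job, that the httpfs extension can be installed at runtime, and that the job's IAM role grants access to the bucket; the bucket path and filter are placeholders:

```python
import duckdb

con = duckdb.connect()

# Enable S3 access; with a recent DuckDB, CREDENTIAL_CHAIN picks up the job's IAM role.
con.execute("INSTALL httpfs;")
con.execute("LOAD httpfs;")
con.execute("CREATE SECRET (TYPE S3, PROVIDER CREDENTIAL_CHAIN);")

# Same filter-then-load step as the chDB version, returned as a pyarrow.Table.
arrow_table = con.execute(
    "SELECT * FROM read_parquet('s3://your-bucket/path/to/file.parquet') "
    "WHERE VendorID = 1"
).fetch_arrow_table()

print(f"Rows: {arrow_table.num_rows:,}")
```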
While this article focused on S3-triggered jobs, Glue Python Shell can also be used as a general-purpose long-running job environment.
I hope this helps you design your ETL workflows and data platforms more effectively.
2026-01-31 22:09:08
This is a submission for the New Year, New You Portfolio Challenge Presented by Google AI
I am Abu Taher Siddik, a Full Stack Developer based in the quiet landscapes of Bargopi, Sunamganj. As an introvert, I’ve always found that words can be fleeting, but well-architected code is eternal.
My portfolio, designed with a "Celestial 7th Heaven" aesthetic, is intended to express my professional philosophy: that the most powerful digital solutions are often built in silence and focused isolation. I specialize in PHP and Python, building bridges between complex backend logic and ethereal frontend experiences.
Building a portfolio that ranks #1 requires a blend of performance and storytelling. Here is the breakdown of my process:
I am most proud of the Project Synergy section. Highlighting my work on the Work AI Chat Studio (published on Codester) alongside my standalone version of WP Automatic demonstrates my ability to handle both high-level AI integration and deep-system automation.
Technically, I am particularly fond of the Starfield Canvas engine—it creates an immersive environment that proves a portfolio can be a work of art without sacrificing loading speed or SEO potential.
Location: Chhatak, Sunamganj, Bangladesh 🇧🇩
Specialty: PHP | Python | Full Stack Architecture
2026-01-31 22:05:31
Build in public is an interesting experiment overall. You get new readers, some of them even stick around, you start getting invited into different communities, you end up with a proud count of EIGHT stars on your repository and at some point you inevitably find yourself trying to fit into some LLM-related program just to get free credits and avoid burning through your own money too fast. I honestly think everyone should try something like this at least once, if only to understand how it actually feels from the inside.
At the same time there are obvious downsides. Writing updates every single week while having a full-time job requires a level of commitment that is harder to sustain than it sounds, because real life has a habit of getting in the way: a sick cat, a work emergency, getting sick yourself or just being too tired to produce something coherent. After a while it starts to feel uncomfortably close to a second job and I’ve had to admit that I’m probably not as committed to blogging as I initially thought I was. Honestly, keeping a build-in-public series going for more than a couple of months requires either a wealthy uncle or a very solid stock plan from a big company.
The work itself didn’t stop. Things kept moving, the system kept evolving and at some point it made sense to pause and do a proper recap of what we’ve actually been building. Yes, we skipped three weekly updates, but looking at the current state of the project, I’d say the result turned out pretty well.
Before getting into the details it’s worth briefly recalling how this started. Wykra began as a small side project built mostly for fun as part of a challenge, without any serious expectations or long-term plans, and somewhere along the way turned into whatever this is now. What it actually does: you tell it something like "vegan cooking creators in Portugal with 10k–50k followers on Instagram/TikTok" and it goes hunting across Instagram and TikTok, scrapes whatever profiles it finds, throws them at a language model for analysis and gives you back a ranked list with scores and short explanations of why each profile ended up there. You can also just give it a specific username if you already have someone in mind and want to figure out whether they're actually worth reaching out to.
Since the original challenge post this has turned into a nine-post series on Dev.to and before moving on it's worth taking a quick look at how those posts actually performed.
As you can see the first two posts did pretty well and after that the numbers slowly went down. At this point the audience mostly consists of people who clearly know what they signed up for and decided to stay anyway.
At this point it makes more sense to stop talking and just show what this actually looks like now.
The first thing you hit is the landing page at wykra.io, which tries to explain what this thing is in about five seconds. I'm genuinely more proud of the fact that we have a landing page at all than of the fact that we even have a domain for email. Also please take a moment to appreciate this very nice purple color, #422e63, because honestly it's pretty great.
We also have a logo that Google Nano Banana generated for us, it’s basically connected profiles drawn as a graph, which is exactly what this thing is about.
After that you can sign up via GitHub because we still need some way to know who's using this and prevent someone from scraping a million dollars' worth of data in one go. Once you're in, you end up in a chat interface that keeps the full history and very openly tells you that searches can take a while, up to 20 minutes in some cases. Sadly there's no universe where this kind of discovery runs in five or six seconds. That's just how it works when you're chaining together web search, scraping and LLM calls.
Eventually you get back a list of profiles the system thinks are relevant, along with a score for each one and a short explanation of why it made the cut.
You can also ask for an analysis of a specific profile if you want to sanity-check whether someone is actually any good.
When you do that you get a quick read on what the account is actually about: the basic stats, a short summary written in human words and a few signals around niche, engagement and overall quality. It's not trying to pass final judgment on anyone, it just saves you from opening a dozen tabs and scrolling for twenty minutes to figure out whether a profile looks legit.
You can also use the whole thing directly in Telegram if the web version isn't your style. Same interface, same flows, just inside Telegram instead of a browser.
For anyone who cares about how this is actually put together, here’s the short version of the stack.
The backend is built with NestJS and TypeScript with PostgreSQL as the main database and Redis handling caching and job queues. For scraping Instagram and TikTok we use Bright Data, which takes care of the messy part of fetching profile data without us having to fight platforms directly. All LLM calls go through LangChain and OpenRouter, which lets us switch between different models without rewriting half the code every time we change our mind. Right now Gemini is the main workhorse and GPT with a web plugin handles discovery, but the whole point is that this setup stays flexible. Metrics are collected with Prometheus, visualized in Grafana and anything that breaks loudly enough ends up in Sentry.
The frontend is React 18 with TypeScript, built with Vite and deliberately boring when it comes to state management. Just hooks, no extra libraries. It also plugs into Telegram's Web App SDK, which is why the same interface works both in the browser and inside Telegram without us maintaining two separate apps.
If you're the kind of person who prefers one picture over five paragraphs of explanation, this part is for you. Below is a rough diagram of how Wykra is wired up right now. It's not meant to be pretty or final, just a way to see where things live and how data moves through the system.
If you trace a single request from top to bottom, you're basically watching what happens when someone types a message in the app: the API accepts it, long-running work gets pushed into queues, processors do their thing, external services get called, results get stored and errors get yelled about.
All LLM calls go through OpenRouter with Gemini 2.5 Flash doing most of the day-to-day work like profile analysis, context extraction and chat routing and GPT-5.2 with the web plugin used specifically for discovering Instagram profile URLs.
All LLM calls → OpenRouter API
├─ Gemini 2.5 Flash (primary workhorse)
│ ├─ Profile analysis
│ ├─ Context extraction
│ └─ Chat routing
│
└─ GPT-5.2 with web plugin
   └─ Instagram URL discovery
Searching for creators on Instagram is a bit of a dance, because Bright Data can scrape profiles but doesn't let you search Instagram directly. So we first ask GPT with web search to find relevant profile URLs and only then scrape and analyze those profiles.
For TikTok things are simpler because Bright Data actually supports searching there directly. So we skip the whole "ask GPT to find URLs" step and just tell Bright Data what to look for.
Honestly? Search doesn't work perfectly yet. Some results are great, some are questionable and there are edge cases where the system does something a bit strange. That's expected when you're stitching together web discovery, scraping and LLM analysis into one pipeline. Right now we're working on making the results more relevant and making the whole thing cheaper to run, because discovering creators should not feel like lighting money on fire.
But that's work for next week.
For now, if you want to dig into the code, everything lives here: https://github.com/wykra-io/wykra-api
And if you've made it all the way to the end and have thoughts, questions, or strong opinions about how this is built, feel free to share them. That's kind of the point of doing this in public.