Nexteam is sponsoring this newsletter.
New Stealthy 'Krasue' Linux Trojan Targeting Telecom Firms in Thailand
A new Linux remote access trojan named Krasue, after a Southeast Asian spirit, has been targeting telecom companies in Thailand since at least 2021. Its initial access methods are unclear but may involve exploiting vulnerabilities or brute-force attacks. Krasue uses a rootkit, resembling an unsigned VMware driver, for persistence and stealth. This rootkit, derived from open-source projects, enables the trojan to hide its activities. Krasue's similarities with XorDdos malware suggest a common author or source code access. Group-IB has identified one confirmed case and is investigating three potential incidents, highlighting the need for continuous vigilance in cybersecurity.
https://thehackernews.com/2023/12/new-stealthy-krasue-linux-trojan.html
vCluster — Kubernetes In Kubernetes In Kubernetes
https://bmiguel-teixeira.medium.com/vcluster-kubernetes-in-kubernetes-in-kubernetes-35af8cad9e48
Observability Is About Confidence
A case can be made for considering observability instrumentation as an integral element of software development, akin to the status that unit testing has achieved.
https://www.honeycomb.io/blog/observability-is-about-confidence
A look at CVE-2023-23415, a Windows ICMP vulnerability + mitigations (which is not a cyber meltdown)
ext4 data corruption in 6.1 stable tree
The issue was first reported by Daniel Díaz, who observed regressions in certain tests (ltp-syscalls' preadv03) in various environments, including dragonboard-845c and qemu-arm64, affecting kernel versions such as v5.10.202-rc1, v5.15.140-rc1, and v6.1.64-rc1. Jan Kara acknowledged the report and asked for tests on the current upstream kernel.
Further investigation revealed that the failure is due to an interaction between iomap code and ext4 code, specifically related to a commit (936e114a245b6) not present in stable releases. This absence caused file position not to be updated after direct IO write, leading to data corruption. The issue is present in all stable kernels that include the commit 91562895f803 before version 6.5.
Jan Kara urged Greg Kroah-Hartman to remove the problematic commit from all stable kernels before 6.5 as soon as possible to prevent data corruption. A proper backport is to be worked out later.
https://lore.kernel.org/stable/20231205122122.dfhhoaswsfscuhc3@quack3/
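The symptom described above, a file position that fails to advance after a write, can be illustrated with a small check. This is only a rough sketch, not the LTP test or anything taken from the mailing-list thread: the function name `write_and_check` and its `extra_flags` parameter are hypothetical. It uses ordinary buffered writes; reproducing the reported condition would require O_DIRECT writes with properly aligned buffers on an affected ext4 kernel, which this sketch does not attempt.

```python
import os
import tempfile


def write_and_check(path, chunks, extra_flags=0):
    """Write chunks back to back and confirm the file offset follows each write.

    extra_flags is a hypothetical hook for additional open() flags; the sketch
    itself is only exercised with plain buffered I/O.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o600)
    try:
        expected = 0
        for chunk in chunks:
            expected += os.write(fd, chunk)
            # Ask the kernel where the next write would land.
            if os.lseek(fd, 0, os.SEEK_CUR) != expected:
                return False
    finally:
        os.close(fd)
    # If the offset had not advanced, later chunks would overwrite earlier ones.
    # Assumes each small write completes in full, which is typical for regular files.
    with open(path, "rb") as f:
        return f.read() == b"".join(chunks)


with tempfile.TemporaryDirectory() as tmp:
    ok = write_and_check(os.path.join(tmp, "probe.bin"), [b"a" * 4096, b"b" * 4096])
```

On a healthy system `ok` is true: the offset moves forward by the number of bytes written, and the second chunk lands after the first rather than on top of it.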
A Glimpse into the Redesigned Goku-Ingestor vNext at Pinterest
The article discusses the evolution and optimization of Pinterest's real-time metrics asynchronous data processing pipeline, known as Goku-Ingestor, which powers the time series database Goku. The pipeline, which has been running for nearly a decade, initially faced challenges like high fleet costs for perceived throughput and reliability issues, including data loss.
Key problems identified included high garbage collection (GC) overhead and inefficiencies in the concurrency model. The old architecture involved creating new threads for every batch of metrics and duplicating data points at each processing stage. This led to frequent young GC pauses and overall inefficiency.
The Goku-Ingestor vNext Architecture aimed to achieve high throughput with minimal hosts. It introduced a 3-thread pool executor model, reducing temporary objects and optimizing resource usage. Key improvements included eliminating redundant processes and minimizing GC overhead. Memory profiling helped identify expensive APIs, like the string.split function, and optimize them.
The results of these improvements were significant. Goku-Ingestor vNext required 50% to 65% fewer EC2 instances, with GC pause times reduced to just 10% to 25% of their original duration. This not only improved throughput and system performance but also resulted in substantial resource savings. The article showcases the transformative effects of the new system, demonstrating improved efficiency with a reduced number of hosts.
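The contrast the summary draws, between a new thread for every batch and a small pool of reused threads, can be sketched as follows. This is an illustration under stated assumptions rather than Pinterest's implementation: it is written in Python instead of the ingestor's own language, the "name value" line format and all function names are hypothetical, and it reads the "3-thread pool executor model" as a single pool of three workers, which the summary does not confirm.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def parse_point(line):
    # Hypothetical "name value" format. partition() returns a fixed 3-tuple
    # instead of building a list the way split() does.
    name, _, value = line.partition(" ")
    return name, float(value)


def process_batch(batch):
    # Sum values per metric name and report which worker thread handled the batch.
    totals = {}
    for line in batch:
        name, value = parse_point(line)
        totals[name] = totals.get(name, 0.0) + value
    return threading.current_thread().name, totals


def ingest(batches, workers=3):
    # The pool is created once and its threads are reused for every batch,
    # rather than starting a fresh thread per batch.
    with ThreadPoolExecutor(max_workers=workers, thread_name_prefix="ingest") as pool:
        return list(pool.map(process_batch, batches))


batches = [[f"cpu {i}", f"mem {i * 2}"] for i in range(20)]
results = ingest(batches)
thread_names = {name for name, _ in results}
```

Twenty batches are handled by at most three threads, so the number of threads stays constant as the batch count grows.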
Russ Cox at ACM SCORED: Open Source Supply Chain Security at Google
Notes from Five Days of Re:Invent
https://medium.com/@AaronKalair/aws-reinvent-2023-day-1-monday-6a6c1c938b2e
https://medium.com/@AaronKalair/aws-reinvent-2023-day-2-tuesday-28d8ca739396
https://medium.com/@AaronKalair/aws-reinvent-day-3-wednesday-e8d2a144e0f1
https://medium.com/@AaronKalair/aws-reinvent-2023-day-4-thursday-095aafec0deb
https://medium.com/@AaronKalair/aws-reinvent-2023-day-5-friday-f9cf24273523
Moving to the Cloud Is More than Just a Purchasing Exercise
Major Hayden rightly points out that transitioning to the cloud involves much more than merely a procurement process. Treating it solely as a purchasing activity is a shortsighted approach that can lead to significant financial losses in multiple ways.
https://major.io/p/cloud-more-than-purchasing-exercise/
Amazon EC2 C6gd and R6gd instances are now available in AWS GovCloud (US-East) Region
AWS Secrets Manager announces 99.99% Service Level Agreement
https://aws.amazon.com/about-aws/whats-new/2023/12/aws-secrets-manager-service-level-agreement/
You're so worried about AWS reliability, the cloud giant now lets you simulate major outages
Fake it 'til you break it, for a whole availability zone or WAN FAIL
https://www.theregister.com/2023/12/01/aws_az_fault_injection_service/
Generally Available: AKS support for API breaking change detection
GA: Azure Kubernetes Service (AKS) support for 5K Node limit by default for Standard tier clusters
Generally available: PostgreSQL 16 support in Azure Database for PostgreSQL – Flexible Server
Deploying an OPNsense/pfSense Hyper-V virtual machine
https://blog.workinghardinit.work/2023/12/05/deploying-an-opnsense-pfsense-hyper-v-virtual-machine/
How to use a cloud-native DNS resolver in Azure
https://autosysops.com/blog/cloud-native-dns-resolver-in-azure
Newsletter sponsor: Nexteam
Technology, Experience, Delivered.
Thanks for reading the Infra Weekly Newsletter! Subscribe for free to receive new posts and support my work.