
Discover Latest #Assembly News, Articles and Videos with Contenting

The Assembly tag has been featured in the following news articles, videos and other media.


Shiny in Production 2025: Full Length Talks | R-bloggers

We are pleased to announce the full line-up for this year's Shiny in Production conference! The conference includes nine full-length talks (25 minutes each) and a lightning talk session (5 minutes per talk); we'll cover the lightning talks in a separate blog post. Register now.

Talks

Cameron Race - Head of Children and Schools Statistics and Product Manager
shinyGovstyle: A 'Shiny' Secret Weapon for Production-Ready Government Public Services
In the UK, we are required to make public sector websites accessible to all users. While there is a wealth of UK government data publicly available through a number of existing digital services, it can be tough to engage with. Government analysts are increasingly turning to R Shiny to enhance their data dissemination, making it more engaging for users. But with hundreds of analysts working in silos across government, how can analysts build full digital services in a way that carries the same consistency, trustworthiness and authority as a domain such as GOV.UK?

Charlie Gao - Posit Software, PBC
Advances in the Shiny Ecosystem
Charlie Gao, Senior Software Engineer on Posit's open source team, will review some of the latest high-performance async tooling developed by Posit to support R Shiny in terms of performance, scalability and user experience.

Colin Fay - ThinkR
After {shiny} — Bringing R to Mobile with webR
As the use of mobile devices becomes increasingly central to how users interact with data products, the R community has long sought ways to bring R-powered applications into the mobile space. Historically, this has meant adapting {shiny} apps for smaller screens—either through responsive design or packages like {shinyMobile}. While effective for certain use cases, these approaches are fundamentally web-based, requiring a server and a stable internet connection, and lacking access to native device features. This talk presents a new path forward: Rlinguo, a fully native mobile application built with webR, a version of R compiled to WebAssembly. Unlike traditional {shiny}-based solutions, Rlinguo runs R directly on the device, without a server. It works offline, stores data locally, and can leverage native mobile APIs—pushing the boundaries of what's possible with R in a mobile context. Through this case study, we'll explore the architecture behind Rlinguo, contrast it with the {shiny} model, and discuss what it means for the future of R development. Topics will include:
- What it takes to embed R in a mobile app using webR
- Technical and design trade-offs between web-based and native solutions
- Practical applications for offline, device-integrated R tools
Whether you're building with {shiny} today or simply curious about the next evolution of R in production, this session offers a look at where R can go when it steps beyond the browser.

Gabriela De Lima Marin - Brazilian Network Information Centre
Bringing Connectivity Data Together: An R Shiny Platform for Public Schools
This project presents a collaborative initiative aimed at improving the geolocation accuracy of Brazilian public schools through an interactive Shiny web application. By integrating existing location data from the Brazilian School Census with APIs from Google, Microsoft, and OpenStreetMap, we established an innovative workflow to assign accurate geographic coordinates to schools previously lacking precise location data. The Shiny application provides a user-friendly interface allowing school administrators and education managers to visually verify and manually adjust school locations via interactive maps. Over the past two years, this approach enabled the precise geolocation of previously unlocated schools and significantly enhanced the accuracy of school geolocation data. The geolocation data collected and validated through this project will be openly shared with relevant governmental stakeholders, promoting transparency and supporting evidence-based decision-making. Moreover, the project exemplifies how collaborative data science and innovative web technology—particularly R Shiny—can be effectively leveraged in public administration, enabling managers, stakeholders, and the community to directly contribute to data accuracy and positively influence educational outcomes in Brazil.

Jack Anderson - National Disease Registration Service, NHS England
Transforming the reporting of national patient outcomes with Shiny: 30-day mortality post-Systemic Anti-Cancer Therapy
In June 2020, the National Disease Registration Service began reporting 30-day mortality post-Systemic Anti-Cancer Therapy (SACT) Case-Mix Adjusted Rates (CMAR) to NHS trusts in England. This work applies logistic regression to report trust-level case-mix adjusted 30-day mortality rates, which enable comparisons between trusts and with the national average. Historically, results were shared as an Excel workbook with an accompanying companion brief and FAQ document, and each report was shared in isolation from previous releases. Since April 2023, implementation of R Shiny has enabled 30-day mortality rates to be reported seamlessly on an interactive, publicly accessible dashboard. Utilising the Plotly and DT packages, dynamic funnel plots and data tables are tailored to user needs through Shiny input pickers, which reactively subset and summarise data visualisations based on user selections. This enables NHS trust users to flexibly review their 30-day mortality outcomes against those of other trusts, their wider Cancer Alliance, and national averages, both overall and stratified by key patient demographics. The Shiny dashboard also enables users to view current and previous CMAR reports together in one place and includes download button functionality for documentation and underlying data. With dedicated tabs for summary data, trust exclusions, and trust response statements, Shiny allows for end-to-end exploration of CMAR outcomes, making it easier for users to gain insight into clinical practice. The resulting Shiny dashboard supports clinical governance within trusts and enables clinical colleagues to better understand their patient outcomes within their wider context.

Laura Mawer & Marcus Palmer - Datacove, Harrison-Palmer Limited
Using Shiny for Python to Power AI-Driven University Application Forecasting
Universities face growing uncertainty in student recruitment, making accurate forecasting critical for strategic and financial planning. Athena is an AI-powered prediction tool that leverages Shiny for Python to provide real-time insights into application trends. By combining machine learning (Random Forest models), trend analysis, and interactive scenario planning, Athena enables universities to test recruitment strategies, adjust campaign spending, and instantly see the projected impact on future application numbers. This talk will explore how Shiny for Python was used to develop a fully interactive forecasting tool without requiring extensive front-end development. We will discuss why Shiny for Python was chosen, how it integrates with a machine learning pipeline, and how it powers real-time scenario analysis with dynamic dashboards. Additionally, we'll demonstrate how AI-generated recommendations via an API enhance decision-making, providing actionable insights tailored to user-selected scenarios. Attendees will gain practical knowledge on building AI-driven, interactive applications using Shiny for Python, implementing predictive models, and designing intuitive decision-support tools for non-technical users. The session will conclude with a live demo, showing Athena in action and sharing best practices for deploying Shiny for Python in production. This talk is designed for developers, data scientists, engineers, and senior decision-makers looking to leverage AI-powered forecasting, business intelligence, and strategic planning in a real-world application.

Nic Crane - NC Data Labs
htmlwidgets Are a Secret Sauce in R – Can LLMs Make Them the Perfect Condiment?
htmlwidgets quietly power some of the most compelling Shiny apps out there, but writing them from scratch can be fiddly and time-consuming. In this talk, we'll kick things off by taking an audience-sourced ingredient list and asking a large language model to whip up a fresh htmlwidget. Then we'll plate up a version we prepared earlier - also model-generated - but chopped, seasoned, and finished with our own touches. Along the way, we'll explore how LLMs can assist in crafting htmlwidgets that reflect your flavour of R - from tidy eval to package structure - rather than sticking to a bland house style.

For updates and revisions to this article, see the original post.
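Jack Anderson's abstract above describes a common Shiny pattern: input pickers that reactively subset data feeding DT tables and Plotly funnel plots. The sketch below is a minimal, self-contained illustration of that reactive-subsetting pattern in R; the data frame, trust names and columns are invented for illustration and are not taken from the NDRS dashboard.

    # Hypothetical data standing in for trust-level outcome rates
    library(shiny)
    library(DT)

    mortality <- data.frame(
      trust = rep(c("Trust A", "Trust B", "Trust C"), each = 2),
      group = rep(c("Overall", "Age 75+"), times = 3),
      rate  = c(2.1, 3.4, 1.8, 2.9, 2.5, 3.8)
    )

    ui <- fluidPage(
      # Picker that drives the reactive subset
      selectInput("group", "Patient group", choices = unique(mortality$group)),
      DTOutput("table")
    )

    server <- function(input, output, session) {
      # Subset recomputes whenever the picker changes
      filtered <- reactive({
        mortality[mortality$group == input$group, ]
      })
      output$table <- renderDT(filtered())
    }

    shinyApp(ui, server)

Swapping the table for a plotly::renderPlotly() funnel plot follows the same reactive() pattern, with the plot reading from filtered() instead of the table.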

Learning Antimicrobial Resistance (AMR) genes with Bioconductor | R-bloggers

Instead of flashcards, we Rube Goldberg'd this with Bioconductor! Analyzed 3,280 E. coli genomes from NCBI, detecting ESBL genes in 84.4% of samples. CTX-M-15 was most common. Helped us understand gene nomenclature and sequence analysis! 📊🔬

Motivation
I've always had a hard time learning and remembering all these genes for antimicrobial resistance (AMR). Yes, we could probably create some nice flash cards and try to memorize them that way. But why do it the easy way when we can Rube Goldberg this, and use it as an opportunity to revisit Bioconductor and learn it! Let's go!

Thought Process
But how? There is so much to learn! Well, to make it somewhat clinically applicable, or at least bridge the gap in understanding, let's answer some of my own questions:
- What genes control the production of extended spectrum beta lactamase (ESBL) in Escherichia coli (E. coli)?
- Do these genes have the same DNA sequence across species and genus?

Disclaimer
I am not a bioinformatician and do not work with AMR; the article and methods presented are my attempt to form a better memory association between clinical AMR and the genetic terminology usually used by microbiologists. Please take this with a grain of salt and verify the information presented. If you notice any errors in this article, please let me know so that I can learn! Also, some of the analysis results were not run during R Markdown knitting because that causes a significant delay; however, the results posted here should be reproducible. Again, please let me know if they are not.

Objectives
- How To Download E. coli Data?
- How To Download Class A Beta Lactamase Genes?
- How To Detect Genes?
- Let's Go All In and Assess ALL Available ESBL E. coli in NCBI
- Answers To My Questions
- Opportunities for Improvement
- Lessons Learnt

How To Download E. coli Data?
Let's select 2 different groups. The first group is just a search of the bacteria, downloading the first 10. For the second group, we'll specifically filter for the ESBL group.

Regular E. coli
Click here to go to this page. Select the first 10 E. coli and then click Download > Download Package. The selection should be default like above, then hit Download and you'll have a zip file. Once you've downloaded the zip file above and unzipped it, go to the data folder and you'll find quite a few folders named by their NCBI RefSeq assembly, like so. Each folder contains an fna file (a FASTA file containing nucleotide sequences) holding the whole genome sequence of that particular strain of bacteria. Let's take a look at the first assembly!

    (sequence filter(assembly != "none") |> group_by(file) |> mutate(n = row_number()) |> filter(n == 1) |> ungroup(file) |> select(esbl_seq) |> mutate(gene=str_extract(esbl_seq,"(?
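The excerpt above cuts off before the detection code, so here is a minimal, hypothetical sketch of the "How To Detect Genes?" step using Bioconductor's Biostrings. The file names (ecoli_assembly.fna, blaCTX-M-15.fasta) are placeholders rather than the files used in the post, and exact pattern matching is only a rough stand-in for how the author scans assemblies for ESBL genes.

    # Detect an ESBL reference gene in a downloaded genome assembly (sketch)
    library(Biostrings)

    genome <- readDNAStringSet("ecoli_assembly.fna")      # whole-genome contigs/scaffolds
    esbl   <- readDNAStringSet("blaCTX-M-15.fasta")[[1]]  # one reference ESBL gene

    # Count exact hits on both strands across every sequence in the assembly
    hits_fwd <- vcountPattern(esbl, genome)
    hits_rev <- vcountPattern(reverseComplement(esbl), genome)

    data.frame(
      contig  = names(genome),
      present = (hits_fwd + hits_rev) > 0
    )

Allowing a few mismatches (the max.mismatch argument of vcountPattern) or using alignment-based tools would be closer to real gene-detection practice, since resistance gene variants rarely match a single reference exactly.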

Learning The Basics of Phylogenetic Analysis | R-bloggers

🧬🔬 Explore phylogenetic analysis from genome to tree! Basic workflow with R/Bioconductor. Learnt to work with a large genomic dataset: extract 16S rRNA from 10K+ E. coli strains using datasets dehydrate, barrnap for extraction, rapidNJ for tree building & FigTree for visualization.

Motivation
After that last hands-on experience with Bioconductor, we will continue our journey into phylogenetic analysis. I've always been intrigued by how biologists piece these phylogenetic trees together, and I want to know the big idea of how this is done. We'll again be using Bioconductor. Let's go!

Disclaimer
I am not a bioinformatician and do not work with genes directly; the article and methods presented are my attempt to get a bird's-eye view of how we go from different isolates to piecing them together onto a single tree. Please take this with a grain of salt and verify the information presented. If you notice any errors in this article, please let me know so that I can learn! Also, some of the analysis results were not run during R Markdown knitting because that causes a significant delay; however, the results posted here should be reproducible. Again, please let me know if they are not.

The End Goal
- Get a basic workflow of how this is done
- Assess many, many E. coli strains
- Assess different genera on a single tree
Looks quite doable! Along the way, we may take some deeper dives into the machinery behind it. Ultimately, we want to visualize beautiful trees! 🌴 Like this.

Objectives
- What is phylogenetic analysis?
- The workflow: Extract 16S rRNA, Align, Distance Calculation, Tree Construction
- Visualizing the phylogenetic tree
- E. coli large dataset
- Other genera
- Opportunities for improvement
- Lessons Learnt

What is phylogenetic analysis?
Phylogenetic analysis is a method used to study the evolutionary relationships between organisms. It involves comparing genetic sequences, such as DNA or RNA, to infer how species are related through common ancestry. This analysis can help identify how different organisms have evolved over time and can be particularly useful in contact tracing and outbreak assessment. Here is a quick wiki.

Let's Download ALL E. coli FASTA
One of our previous opportunities for improvement was to learn to use the NCBI datasets CLI. We'll be using it this time because if you try to download more than 10,000 FASTA files via the website, or even the CLI itself, it won't be smooth, at least in my multiple tries. I attempted maybe 5-6 times and couldn't complete the download even when I used datasets, and was unable to resume the downloads. The most stable approach is actually to download a dehydrated package and then rehydrate it. See here.

Terminal

    curl -o datasets 'https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v2/mac/datasets'
    chmod +x datasets dataformat
    datasets download genome taxon "Escherichia coli" \
      --assembly-level scaffold,chromosome,complete \
      --dehydrated \
      --filename ecoli_high_quality_dehydrated.zip
    unzip ecoli_high_quality_dehydrated.zip
    cd ncbi_dataset
    datasets rehydrate --directory

For large genome downloads, use the dehydrated download and then rehydrate; it's easier to resume the download if the connection is lost.

You'll notice above that we left out contig and included only scaffold, chromosome, and complete. That's because there would be 300,000 sequences if we included all of them, and I definitely don't need all of those packages. With scaffold, chromosome, and complete, we got 30,000+ and the entire zip file was 58 gigs of data. 😵‍💫 Just for my education:
- Contig - Short for "contiguous sequence." A continuous stretch of DNA sequence with no gaps. Contigs are assembled from overlapping sequencing reads, but they represent isolated pieces without known relationships to other contigs.
- Scaffold - A collection of contigs that have been ordered and oriented relative to each other, often with gaps of estimated size between them. Scaffolding uses additional information like paired-end reads or mate-pair libraries to determine how contigs should be arranged, even when the sequence connecting them isn't fully resolved.
- Chromosome - Contigs and scaffolds have been assembled into chromosome-scale sequences that represent entire chromosomes. This typically requires additional long-range information like Hi-C data, optical mapping, or long-read sequencing to achieve proper chromosome-level organization.
- Complete - The highest level of assembly, where the entire genome is finished with no gaps, including challenging repetitive regions, centromeres, and telomeres. This represents a truly complete, end-to-end sequence of all chromosomes.

The Workflow
The workflow for phylogenetic analysis typically involves several key steps: Extract 16S rRNA -> Align -> Calculate Distance -> Construct Tree

Extract 16S rRNA
This step involves obtaining the genetic material from the organisms of interest, such as bacteria or viruses. For example, in the case of antibiotic resistance, researchers might extract the 16S rRNA gene, which is commonly used for bacterial identification. The reason for extracting 16S rRNA in bacteria is that the sequence is highly conserved across different species, making it a reliable marker for phylogenetic analysis. The 16S rRNA gene is present in all bacteria and archaea, and its sequence can provide insights into the evolutionary relationships between different microbial species. Let's dive into the code to do this. We'll use the same 2 groups we used before: regular E. coli and ESBL E. coli.

    library(Biostrings)
    library(DECIPHER)
    # load data
    path1
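The excerpt ends just as the code begins, so here is a minimal sketch of the align -> distance -> tree steps, staying entirely within R. It assumes the 16S rRNA sequences have already been extracted (for example with barrnap) into a placeholder file 16s_ecoli.fasta, and it uses ape::nj() as a small-scale stand-in for the rapidNJ + FigTree route the post describes.

    # Align -> distance -> neighbour-joining tree (illustrative sketch)
    library(Biostrings)
    library(DECIPHER)
    library(ape)

    seqs <- readDNAStringSet("16s_ecoli.fasta")   # pre-extracted 16S sequences

    # 1. Align the 16S sequences
    aligned <- AlignSeqs(seqs, processors = NULL)

    # 2. Pairwise distance matrix from the alignment
    d <- DistanceMatrix(aligned, correction = "Jukes-Cantor", processors = NULL)

    # 3. Neighbour-joining tree from the distances
    tree <- nj(as.dist(d))

    # 4. Quick look in R, or export Newick for FigTree
    plot(tree, cex = 0.6)
    write.tree(tree, "ecoli_16s_nj.nwk")

For the 10K+ strain dataset the post works with, an in-memory distance matrix becomes the bottleneck, which is presumably why the author reaches for a dedicated tool like rapidNJ rather than building the large tree inside R.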