
Discover Latest #Ubuntu News, Articles and Videos with Contenting

1. Ubuntu 21.04 'Hirsute Hippo' Now Available for Download: https://news.softpedia.com/news/ubuntu-21-04-hirsute-hippo-now-available-for-download-531156.shtml
2. Ubuntu 21.04: A Brief Overview of the Latest Features: https://itsfoss.com/ubuntu-21-04-overview/
3. Video Guide: How to Install Ubuntu 21.04: https://www.youtube.com/watch?v=e5y5lgV1YpA
4. How to Install and Configure Apache on Ubuntu 21.04: https://linuxize.com/post/how-to-install-apache-on-ubuntu-21-04/
5. Ubuntu 21.04 Review: A Solid Linux Distribution for Everyone: https://www.omgubuntu.co.uk/2021/04/ubuntu-21-04-review


Reproducible data science with Nix, part 11 — build and cache binaries with Github Actions and Cachix | R-bloggers

Intro

I have a package on CRAN called {chronicler}, and last month I got an email from CRAN telling me that building the package was failing, and that I had two weeks to fix it. I immediately thought that some dependency my package depends on got updated and somehow broke something. But when I checked the results of the build, I was surprised, to say the least: my package was only failing on Fedora. Now that was really weird; there was no way this was right. Also, I couldn't reproduce this bug on my local machine… but I could reproduce it on Github Actions, on Ubuntu (even though it was ok on CRAN's Debian, which is really close to Ubuntu!), yet I couldn't reproduce it on Windows either! What was going on?

So I started digging, and my first idea was to look at the list of packages that got released on CRAN that day (March 12th 2024) or just before, and I saw something that caught my eye: a new version of {tidyselect} had just been released, and even though my package doesn't directly depend on it, I knew it was likely a dependency of some direct dependency of {chronicler}. So I looked into the release notes, and there it was:

* `eval_select()` out-of-bounds errors now use the verb "select" rather than "subset" in the error message for consistency with `dplyr::select()` (#271).

I knew this was what I was looking for, because the unit test that was failing was one that should error because dplyr::select() was being used on a column that didn't exist. The success of that test was defined as finding a particular error message in the log: a message that used to contain the word subset, but now should contain select. But why was this failing only on Fedora on CRAN and on Ubuntu on Github Actions (but ok on Debian on CRAN)? And why couldn't I reproduce the bug on my OpenSuse Linux computer, even though I was building a bleeding edge development environment using Nix? And then it hit me like my older brother used to.
When building packages on Fedora, CRAN doesn't seem to use pre-compiled binaries, so packages get built from source. This means it takes longer to test on Fedora, but it also means that only the very latest releases of packages get used. On the other platforms, pre-compiled binaries get used if available, and because {tidyselect} had just come out that very day, older binaries of {tidyselect} were still being used on those platforms, but not on Fedora. And because these older binaries didn't include the change, the unit test was still passing there.

On Github Actions, code coverage was computed using covr::codecov(), which installs the package in a temporary directory and seems to pull its dependencies directly from CRAN. Because CRAN doesn't offer Linux binaries, packages got compiled from source, which is why the test was failing there as well: the very latest version of {tidyselect} was being used. (By the way, use Dirk Eddelbuettel's r2u if you want binaries for Ubuntu.)

And on my local machine, even though I was using the latest commit of nixpkgs to have the most bleeding edge packages in my environment, I had forgotten that the R packages on nixpkgs always lag behind the CRAN releases. This is because R packages on nixpkgs tend to get updated alongside a new release of R, and the reason is to ensure a certain level of quality. You see, the vast majority of CRAN (and Bioconductor) packages are made available through nixpkgs in a fully automated way, but some packages do require manual intervention to work on Nix. We only know this if we try to build these packages, and building packages requires quite a lot of resources. I go into more detail here, but in summary: we can't build every CRAN package every single day to see if everything works well, so we only rebuild the whole tree whenever there's a new release of R.
Packages get built on a CI infrastructure called Hydra and then get cached on cache.nixos.org, so whenever someone wants to install a package, a pre-built binary gets pulled from the cache instead of being built from source. For packages that don't need compiling this is not that big of a time save, but for packages that do need to be compiled it is huge. Depending on which packages you want to install, building everything from source could potentially take hours, whereas installing pre-built binaries is just a matter of how fast your internet connection is.

Anyway, I went back to my fork of nixpkgs, updated the expression defining the CRAN packages myself, and installed the latest versions of packages from my fork. Before the update, the error message I was testing against (from {tidyselect} version 1.2.0) contained the word "subset"; after the update, the message (from {tidyselect} version 1.2.1) used "select" instead. So I had found the issue; I updated my unit tests accordingly and pushed the update to CRAN. All is well that ends well, but this made me think: I needed an easy way to have bleeding edge packages on hand from Nix at all times, and so I started working on it.

Github Actions to the rescue

As described in my previous blog post, updating the Nix expressions defining the R packages on nixpkgs involves running an R script that generates a Nix expression, which then builds the R packages when needed. So what I did was create a Github Action that runs this R script every 6 hours and pushes the changes to a branch of my nixpkgs fork. This way, I always have the option of using this branch when I need bleeding edge packages. Because this can be of interest to others, Philipp Baumann started a Github organisation hosting this fork of nixpkgs, which gets updated daily and which you can find here.
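In outline, such an update workflow might look like the following sketch. This is an assumption-based illustration only: the workflow name, trigger, and push details are mine, not the contents of the actual repository; the only piece taken from nixpkgs itself is that the CRAN package set is regenerated by an R script (generate-r-packages.R in pkgs/development/r-modules).

```yaml
# Sketch of an update workflow (names and structure are assumptions).
name: update-r-packages
on:
  push:
    branches: [r-daily-source]   # fired externally, see below
jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: regenerate the Nix expression for CRAN packages
        run: |
          cd pkgs/development/r-modules
          Rscript generate-r-packages.R cran > cran-packages.nix.new
          mv cran-packages.nix.new cran-packages.nix
      - name: push the refreshed expression
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add .
          git commit -m "update CRAN packages" || echo "nothing to update"
          git push origin HEAD:r-daily
```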
Because this action needs to run several times a day, it should be on a schedule, but scheduled actions can only run from master/main. That's not what we wanted, so instead we use another action, in another repository, that pushes a random file to the target repository to get the action going. You can find this repository here, with complete instructions. So, to summarise:

- An action on a schedule runs from b-rodrigues/trigger-r-updates and pushes a file to rstats-on-nix/nixpkgs on the r-daily-source branch
- This triggers an action that updates all of nixpkgs, including the R packages, and pushes the updates to the r-daily branch (you can find it here)
- We can now use the r-daily branch to get bleeding edge R packages on Nix!

This happens without any form of testing, though, so packages could be in a broken state (hey, that's the definition of bleeding edge, after all!). Also, anyone who wants to use this fork to build a development environment has to rebuild a lot of packages from source. Again, this is because these packages are defined in a fork of nixpkgs, and they don't get built on Hydra to populate the public cache that Nix uses by default. So while this fork is interesting because it provides bleeding edge packages, using it on a day-to-day basis can be quite tedious. And this is where Cachix comes into play.

Setting up your own binary cache on Cachix

Cachix is an amazing tool that makes it incredibly easy to set up your own cache: simply build the packages once and push the binaries to the cache. As long as these packages don't get updated, they'll get pulled from the cache instead of being rebuilt. So here is what I now do with my packages: I define a default.nix file that describes a development environment using my fork of nixpkgs as the source for packages. For example, here is this file that defines the environment for my {rix} package.
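Populating such a cache boils down to building the environment once and pushing the resulting store paths, roughly like this (a sketch using the cachix CLI, assuming cachix and Nix are installed; `b-rodrigues` is the cache name used later in this post, and pushing to it requires an auth token from your Cachix account):

```
# one-time, per machine: trust the cache as a binary substituter
cachix use b-rodrigues

# build the environment from default.nix and push the resulting
# store paths to the cache (requires `cachix authtoken <token>` first)
nix-build | cachix push b-rodrigues
```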
I can use this environment to work on my package, and to make sure that anyone else who wants to contribute does so using the same environment. As you can see on line 2, the rstats-on-nix bleeding edge fork gets used:

```nix
pkgs = import (fetchTarball "https://github.com/rstats-on-nix/nixpkgs/archive/refs/heads/r-daily.tar.gz") {};
```

Then, still in {rix}'s repository, I define a new action that builds this environment periodically, using the binary cache I set up with Cachix. You can find this action here. So the r-daily branch of our nixpkgs fork gets updated every 6 hours, and this environment gets updated every 12 hours, at 30 minutes past the hour. Now, every time I want to work on my package, I simply run nix-build on my computer to update the development environment. This is what I see:

```
copying path '/nix/store/0l0iw4hz7xvykvhsjg8nqkvyl31js96l-r-stringr-1.5.1' from 'https://b-rodrigues.cachix.org'...
copying path '/nix/store/cw3lc7b0zydsricl5155jbmldm1vcyvr-r-tibble-3.2.1' from 'https://b-rodrigues.cachix.org'...
copying path '/nix/store/y32kpp09l34cdgksnr89cyvz6p5s94z8-r-tidyselect-1.2.1' from 'https://b-rodrigues.cachix.org'...
copying path '/nix/store/sw24yx1jwy9xzq8ai5m2gzaamvyi5r0h-r-rematch2-2.1.2' from 'https://b-rodrigues.cachix.org'...
copying path '/nix/store/z6b4vii7hvl9mc53ykxrwks1lkfzgmr4-r-dplyr-1.1.4' from 'https://b-rodrigues.cachix.org'...
```

As you can see, packages get pulled from my cache. Packages that are already available from the usual public cache.nixos.org don't get rebuilt nor cached in mine; they simply continue to be pulled directly from there. This makes using the development environment very easy, and guarantees I'm always mirroring the state of packages released on CRAN. The other interesting thing is that I can use this cache with other actions. For example, here is the action that runs the unit tests included in the package in an environment that has Nix installed (some unit tests need Nix to be available to run).
On line 25 you can see that we install Nix and set our fork as the repository to use:

```
nix_path: nixpkgs=https://github.com/rstats-on-nix/nixpkgs/archive/refs/heads/r-daily.tar.gz
```

and just below, we set up the cache:

```yaml
- uses: cachix/cachix-action@v14
  with:
    name: b-rodrigues # this is the name of my cache
```

By using my cache, I make sure the tests run with the freshest possible packages, and I don't run the risk of having a test succeed in an outdated environment. You might have noticed that I am not authenticating to Cachix: to simply pull binaries, no authentication is needed! Cachix has a free plan of up to 5 GB, which is more than enough to set up several development environments like this, and it is really, really easy to set up; it works both on your computer and on Github Actions, as shown. If you want to use this development environment to contribute to {rix}, check out the instructions in the Contributing.md file. You can use the same approach to always have development environments ready for your different projects, and I will likely add the possibility of using this fork of nixpkgs to my {rix} package.

Thanks to Philipp Baumann for nudging me in the direction of using Cachix and showing the way! Hope you enjoyed! If you found this blog post useful, you might want to follow me on Mastodon or twitter for blog post updates and buy me an espresso or paypal.me, or buy my ebooks. You can also watch my videos on youtube. So much content for you to consoom!

Running MLwiN using mlnscript via the R2MLwiN R package on Apple Silicon Macs | R-bloggers

Introduction

MLwiN, from the Centre for Multilevel Modelling (CMM) at the University of Bristol (disclaimer: where I also work), is a fantastic piece of software (Charlton et al. 2024). The name suggests it only works on Windows but, as we'll find out, this is very much not the case. In the past this was sort of true, because to make it work on a Mac (or a Linux machine) with an Intel processor one would need to run it using Wine. More recently, CMM have cleverly made the MLwiN libraries available for other operating systems in a command line version of the program called mlnscript, with an accompanying library. The files for macOS are universal binaries, which means they run natively on both Intel and Apple Silicon Macs. Let's find out how to set this up.1

Setting up mlnscript on an Apple Silicon Mac

1. Obtain the installer for macOS. See the relevant download page (depending upon whether you are an academic) on the MLwiN website. On the form, select the mlnscript for MacOS option from the File to download dropdown menu. This will give you the MLN.dmg installer.
2. Double click the installer. On macOS it is recommended to install the files into the /opt/mln/ directory, which you will need to create with Admin permissions; install to another directory if you don't have Admin permissions.
3. Copy the 2 files mlnscript and libmln.dylib into the /opt/mln (or other) directory.

Once installed, we can check that mlnscript and libmln.dylib are universal binaries as follows (we could also use the file command):

```
lipo -archs /opt/mln/mlnscript
## x86_64 arm64
```

Since both architectures are listed in the output, the files are universal binaries; Apple Silicon Macs will use the arm64 architecture. Now we need to grant the two files permission to run. To do this, run the following in your Terminal:

```
/opt/mln/mlnscript --version
```

On first run, this will fail with a pop-up similar to the following.
Click Cancel, then go into System Settings | Privacy & Security, scroll down, and click Allow Anyway. Running the version check command again, you may receive another popup, in which you click Open. After this, the first popup will appear again, but about the libmln.dylib file; again, click Allow Anyway. Now, running the version check command once more, you should see the version number, which is currently 3.10:

```
/opt/mln/mlnscript --version
## 3.10
```

In R we then install the R2MLwiN package from CRAN (Zhang et al. 2016):

```r
install.packages("R2MLwiN")
```

This completes the setup - phew 😮!

Running a multilevel model

For an example we could run one of the demos in the package; we can list them with the following code:

```r
demo(package = "R2MLwiN")
```

We can run one of these. For example, let's fit the random intercept model from the UserGuide02 demo (the call below is reconstructed from the output that follows; the model is fit on the tutorial dataset bundled with R2MLwiN):

```r
library(R2MLwiN)
# if you did not install mlnscript and libmln.dylib in /opt/mln, set:
# options(MLwiN_path = "/path-to/mlnscript")
data(tutorial, package = "R2MLwiN")
(mymodel1 <- runMLwiN(normexam ~ 1 + sex + (1 | student), data = tutorial))
```

```
#> -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
#> MLwiN (version: unknown or >3.09)  multilevel model (Normal)
#> Estimation algorithm: IGLS        Elapsed time : 0.03s
#> Number of obs: 4059 (from total 4059) The model converged after 3 iterations.
#> Log likelihood: -5727.9
#> Deviance statistic: 11455.7
#> ---------------------------------------------------------------------------------------------------
#> The model formula:
#> normexam ~ 1 + sex + (1 | student)
#> Level 1: student
#> ---------------------------------------------------------------------------------------------------
#> The fixed part estimates:
#>             Coef.  Std. Err.      z   Pr(>|z|)       [95% Conf.  Interval]
#> Intercept  -0.14035   0.02463  -5.70  1.209e-08  ***   -0.18862   -0.09208
#> sexgirl     0.23367   0.03179   7.35  1.985e-13  ***    0.17136    0.29598
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> ---------------------------------------------------------------------------------------------------
#> The random part estimates at the student level:
#>                  Coef.  Std. Err.
#> var_Intercept  0.98454    0.02185
#> -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
```

We can see the output is in several sections. The first section tells us about the mlnscript run: which estimation algorithm it used, how long it took to fit the model, and some characteristics of the dataset. The second section tells us about the model, in this case a random intercept model. The third section gives the fixed effect estimates and the associated statistical inference for them. The fourth section gives the random effect variance estimates. And we can continue with our multilevel modelling as we like.

Summary

Despite having Win in its name, MLwiN is available as a command line program, mlnscript, on operating systems other than Windows (and indeed on other architectures), including macOS for both Intel and Apple Silicon processors and various Linux and Unix distributions (CentOS, Debian, Fedora, FreeBSD, Rocky, and Ubuntu). It is straightforward to use from R via the R2MLwiN package.

References

Charlton, C., J. Rasbash, W. J. Browne, M. Healy, and B. Cameron. 2024. MLwiN Version 3.10. Bristol, UK: Centre for Multilevel Modelling, University of Bristol. https://www.bristol.ac.uk/cmm/software/mlwin/.

Zhang, Z., R. M. A. Parker, C. M. J. Charlton, G. Leckie, and W. J. Browne. 2016. "R2MLwiN: A Package to Run MLwiN from Within R." Journal of Statistical Software 72 (10): 1–43. https://doi.org/10.18637/jss.v072.i10.

1. This post is essentially a more detailed explanation of the advice given on the MLwiN website, here and here.↩︎

gssr is now two packages: gssr and gssrdoc | R-bloggers

Summary

My gssr package is now two packages: gssr and gssrdoc. They're also available as binary packages via R-Universe, which means they will install much faster.

The GSS is a big survey with a big codebook, and distributing it as an R package poses a few challenges. It's too big for CRAN, of course, but that's fine, because CRAN is not a repository for datasets in any case. For some time, my gssr package has bundled the main data file, the panel datasets, and functions for getting the file for a particular year directly from NORC. Recently, I started integrating the codebook (or at least, summaries of every variable in the 1972-2022 data file) into the package. It's a handy feature: it lets you look up GSS variables as if they were R functions (figure: looking up a GSS variable).

The main downside to doing this is that it makes a large package even larger. In addition, it takes a fair amount of time to install from source, because more than 6,500 variables have to be documented during the installation. Providing binary packages would be much better. rOpenSci's R-Universe provides a package-building service that rests on a bunch of GitHub Actions, but the resource constraints of GitHub's runners meant that building a source package would fail on Ubuntu (specifically), and this meant that I couldn't use it.

To get around this I have split the package in two. There's now gssr, which has the datasets (and the ability to fetch yearly datasets) exactly as before, and gssrdoc, which provides the integrated help. They are fully independent of one another; if you install both, you get exactly what gssr used to give you by itself. I think splitting them like this is worth it just because R-Universe can now build package binaries of each, which means installation is much faster and you can use install.packages().
To install both, do:

```r
# Install 'gssr' from 'ropensci' universe
install.packages('gssr',
                 repos = c('https://kjhealy.r-universe.dev',
                           'https://cloud.r-project.org'))

# Also recommended: install 'gssrdoc' as well
install.packages('gssrdoc',
                 repos = c('https://kjhealy.r-universe.dev',
                           'https://cloud.r-project.org'))
```

You can of course permanently add my (or any other) R-Universe repo to the default list of repos that install.packages() will search, by using options() either in a project or in your .Rprofile. The R-Universe help repo has some additional details. Note that if you install both packages you can just load library(gssr), but if you don't want to load gssrdoc you can still query it at the console with e.g. ??polviews or ?gssrdoc::fefam.
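A minimal sketch of that permanent setup, using R's standard options("repos") mechanism (the exact form is my suggestion, not taken from the package docs; only the universe URL comes from this post) — something like this in your .Rprofile:

```r
# Sketch: make install.packages() also search this R-Universe repo.
# local() keeps the helper variable out of your global environment.
local({
  r <- getOption("repos")
  r["kjhealy"] <- "https://kjhealy.r-universe.dev"
  options(repos = r)
})
```

After that, install.packages("gssr") and install.packages("gssrdoc") should work without spelling out the repos argument each time.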