The GitHub Blog

Commits, compare views, and pull requests now highlight individual changed words instead of the entire changed section, making it easier for you to see exactly what’s been added or removed.

And, of course, it works great with split diffs, too:

Diffs now come in two flavors, unified and split. Switch between them on pull request, commit, and compare pages using the toggle in the top right of the page. The mode you last used will become your preferred default.

At GitHub we say, "it's not fully shipped until it's fast." We've talked before about some of the ways we keep our frontend experience speedy, but that's only part of the story. Our MySQL database infrastructure dramatically affects the performance of GitHub.com. Here's a look at how our infrastructure team seamlessly conducted a major MySQL improvement last August and made GitHub even faster.

The mission

Last year we moved the bulk of GitHub.com's infrastructure into a new datacenter with world-class hardware and networking. Since MySQL forms the foundation of our backend systems, we expected database performance to benefit tremendously from an improved setup. But creating a brand-new cluster with brand-new hardware in a new datacenter is no small task, so we had to plan and test carefully to ensure a smooth transition.

Preparation

A major infrastructure change like this requires measurement and metrics gathering every step of the way. After installing base operating systems on our new machines, it was time to test out our new setup with various configurations. To get a realistic test workload, we used tcpdump to extract SELECT queries from the old cluster that was serving production and replayed them onto the new cluster.

MySQL tuning is very workload specific, and well-known configuration settings like innodb_buffer_pool_size often make the most difference in MySQL's performance. But on a major change like this, we wanted to make sure we covered everything, so we took a look at settings like innodb_thread_concurrency, innodb_io_capacity, and innodb_buffer_pool_instances, among others.

We were careful to only make one test configuration change at a time, and to run tests for at least 12 hours. We looked for query response time changes, stalls in queries per second, and signs of reduced concurrency. We observed the output of SHOW ENGINE INNODB STATUS, particularly the SEMAPHORES section, which provides information on work load contention.

Once we were relatively comfortable with configuration settings, we started migrating one of our largest tables onto an isolated cluster. This served as an early test of the process, gave us more space in the buffer pools of our core cluster and provided greater flexibility for failover and storage. This initial migration introduced an interesting application challenge, as we had to make sure we could maintain multiple connections and direct queries to the correct cluster.

In addition to all our raw hardware improvements, we also made process and topology improvements: we added delayed replicas, faster and more frequent backups, and more read replica capacity. These were all built out and ready for go-live day.

Making a list; checking it twice

With millions of people using GitHub.com on a daily basis, we did not want to take any chances with the actual switchover. We came up with a thorough checklist before the transition:

We also planned a maintenance window and announced it on our blog to give our users plenty of notice.

Migration day

At 5am Pacific Time on a Saturday, the migration team assembled online in chat and the process began:

We put the site in maintenance mode, made an announcement on Twitter, and set out to work through the list above:

13 minutes later, we were able to confirm operations of the new cluster:

Then we flipped GitHub.com out of maintenance mode, and let the world know that we were in the clear.

Lots of up front testing and preparation meant that we kept the work we needed on go-live day to a minimum.

Measuring the final results

In the weeks following the migration, we closely monitored performance and response times on GitHub.com. We found that our cluster migration cut the average GitHub.com page load time by half and the 99th percentile by two-thirds:

What we learned

Functional partitioning

During this process we decided that moving larger tables that mostly store historic data to separate cluster was a good way to free up disk and buffer pool space. This allowed us to leave more resources for our "hot" data, splitting some connection logic to enable the application to query multiple clusters. This proved to be a big win for us and we are working to reuse this pattern.

Always be testing

You can never do too much acceptance and regression testing for your application. Replicating data from the old cluster to the new cluster while running acceptance tests and replaying queries were invaluable for tracing out issues and preventing surprises during the migration.

The power of collaboration

Large changes to infrastructure like this mean a lot of people need to be involved, so pull requests functioned as our primary point of coordination as a team. We had people all over the world jumping in to help.

Deploy day team map:

This created a workflow where we could open a pull request to try out changes, get real-time feedback, and see commits that fixed regressions or errors -- all without phone calls or face-to-face meetings. When everything has a URL that can provide context, it's easy to involve a diverse range of people and make it simple for them give feedback.

One year later..

A full year later, we are happy to call this migration a success — MySQL performance and reliability continue to meet our expectations. And as an added bonus, the new cluster enabled us to make further improvements towards greater availability and query response times. I'll be writing more about those improvements here soon.

We've just released some major improvements to our organization audit logs. As an organization admin, you can now see a running list of events as they're generated across your organization, or you can search for specific activities performed by the members of your org. This data provides you with better security insights and gives you the ability to audit account, team, and repository access over time.

The audit log exposes a number of events like repository deletes, billing updates, new member invites, and team creation. You can see the activities of individual team members, along with a map that highlights the location where events originated. Using the new query interface, you can then filter all these events by the action performed, the team member responsible, the date, repository, and location.

For more information on the audit log, check out the documentation.

You'll now start seeing expanded file listings on GitHub that look like this:

The grey text in the paths — in this case, java/com/netflix/ — means those three folders don't contain any other files. Click on the expanded path to save yourself extra clicks and jump directly to the first non-empty directory.

As a reminder, you can also type t to invoke the File Finder and jump directly to any file you like.

Sometimes, you just want to grab someone's attention when you're finished with some cool code. That's why we've added support for GitHub's @mention feature inside GitHub for Windows. You can now @ mention repository collaborators, and when you publish your changes they'll be notified that you'd like them to have a look.

If you already have GitHub for Windows installed, you can update by selecting 'About GitHub for Windows' in the gear menu on the top right. Otherwise, download the latest version from the GitHub for Windows website.

In the summer of 2009, The New York Senate was the first government organization to post code to GitHub, and that fall, Washington DC quickly followed suit. By 2011, cities like Miami, Chicago, and New York; Australian, Canadian, and British government initiatives like GOV.UK; and US Federal agencies like the Federal Communications Commission, General Services Administration, NASA, and Consumer Financial Protection Bureau were all coding in the open as they began to reimagine government for the 21st century.

Fast forward to just last year: The White House Open Data Policy is published as a collaborative, living document, San Francisco laws are now forkable, and government agencies are accepting pull requests from every day developers.

This is all part of a larger trend towards government adopting open source practices and workflows — a trend that spans not only software, but data, and policy as well — and the movement shows no signs of slowing, with government usage on GitHub nearly tripling in the past year, to exceed 10,000 active government users today.

How government uses GitHub

When government works in the open, it acknowledges the idea that government is the world's largest and longest-running open source project. Open data efforts, efforts like the City of Philadelphia's open flu shot spec, release machine-readable data in open, immediately consumable formats, inviting feedback (and corrections) from the general public, and fundamentally exposing who made what change when, a necessary check on democracy.

Unlike the private sector, however, where open sourcing the "secret sauce" may hurt the bottom line, with government, we're all on the same team. With the exception of say, football, Illinois and Wisconsin don't compete with one another, nor are the types of challenges they face unique. Shared code prevents reinventing the wheel and helps taxpayer dollars go further, with efforts like the White House's recently released Digital Services Playbook, an effort which invites every day citizens to play a role in making government better, one commit at a time.

However, not all government code is open source. We see that adopting these open source workflows for open collaboration within an agency (or with outside contractors) similarly breaks down bureaucratic walls, and gives like-minded teams the opportunity to work together on common challenges.

Government Today

It's hard to believe that what started with a single repository just five years ago, has blossomed into a movement where today, more than 10,000 government employees use GitHub to collaborate on code, data, and policy each day.

Those 10,000 active users make up nearly 500 government organizations, from more than 50 countries:

Government code on GitHub spans more than 7,500 repositories with @alphagov, @NCIP, @GSA, and @ministryofjustice being the top open source contributors with more than 100 public repositories each:

You can learn more about GitHub in government at government.github.com, and if you're a government employee, be sure to join our semi-private peer group to learn best practices for collaborating on software, data, and policy in the open.

Happy collaborative governing!

So many of us here at GitHub have benefited from early exposure to science, technology, engineering, and mathematics that we're always looking for ways to help young people develop a genuine interest in technical fields.

We can't think of a better (or more fun) way to help inspire a life-long love of science than to encourage students to experiment with robotics. That's why we're proud to be a sponsor of FIRST (For Inspiration and Recognition of Science and Technology).

FIRST Robotics Competition

Every year, FIRST brings together coaches, industry mentors, and volunteers to help students from all over the world learn by building robots. Its oldest program, the FIRST Robotics Competition (FRC), is geared towards high school students. In 2014, over 50,000 students on more than 2,000 teams participated in FRC.

This year's competition had teams build robots that could transport balls and score goals, with the assistance of a human player. We traveled to St. Louis, MO, for the world championship and made a video about the competition:

FIRST Robotics on GitHub

GitHub supports FIRST to help give students and teachers first-hand experience with software development tools used in the industry. Individual teams host their sites with GitHub Pages, students collaborate on control and vision code across teams, mentors teach code review, teams release applications for scouting, and at least one team (@iLiteRobotics) has released 3D models of all of their robot parts.

Here are just a few examples of how FIRST teams are using GitHub:

GitHub Pages Sites

Code on GitHub

3D Models

Get involved with FIRST

FIRST is a team effort. Coaches, industry mentors, and volunteers are all essential to the continued success of the organization. Here are some ways that you can get involved:

Individuals

Attend a local FIRST event.
Volunteer with an team or regional competition.
Start a new team.

Organizations

Encourage your employees to volunteer and mentor.
Sponsor a local team, especially the teams your employees volunteer with.
Offer a scholarship to participants.
Become a kit of parts supplier.

Our traffic graphs tab shows you a lot of information about who's visiting your repository on the web. We've added a new graph to this tab, showing git clone activity.

You can use it to find out how many times your repository's source code is actually cloned in a given day, as well as how many unique GitHub users (or anonymous IP addresses) did the cloning.

For more information on traffic graphs, check out the documentation.

We've changed the process for adding new GitHub users to your organization. Starting today, users you add will be sent an email invitation. Once they accept this invitation, they'll become a member of your organization.

If you invite a user and they misplace their invitation email, they can always access the invitation from your organization's profile page.

Everyone on GitHub should be able to decide which organizations they'd like to join. This new process reinforces each person's privacy and security.

For more information on inviting people to an organization, check out our documentation. If you're using our API to add people to your organization, check out the new Team Memberships API.

We've upgraded GitHub Pages to support the latest version of Jekyll, the open source static site generator. Whether you're a new user or a savvy veteran, here are a few features that might help make publishing your next site a bit easier:

Native Sass & CoffeeScript support - Simply commit a .coffee, .sass or .scss file to your site's repository, and GitHub Pages will transparently output JavaScript or CSS when your site is published.
Kramdown as the default Markdown engine - In addition to better error handling, Kramdown supports MathJax, fenced code blocks, nested lists, tables, and much more.
Collections - With collections, Jekyll is no longer limited to just posts and pages—it can now publish all kinds of different documents, such as code methods, team members, or your favorite open source projects.
JSON data - .json files in the _data directory now get read in and are exposed to Liquid templates as the site.data namespace (along with .yml files).

Under the hood there's also some great time savers such as front-matter defaults, the where and group_by filters, and a new starter site. Check out the full list of 300+ changes and new features added to Jekyll since version 1.5.1.

If you use Jekyll locally, simply run gem update github-pages, or follow these instructions to update your local build environment to the latest version.

Happy publishing!

We've rebuilt GitHub Issues to be smarter: search smarter, filter smarter, and manage your issues and pull requests smarter.

If you want to see it in action, check out Bootstrap's issues. To learn more, read on.

Search and filter

A big part of managing your issues and pull requests is focusing on what needs to happen next. The new search box at the top of the page gets you there faster:

You can filter your search results by author, label, milestone, and open/close state. You can also use any of our advanced search terms to find just what you're after.

Watch an issue evolve

Over time, titles change, labels and milestones get closer to completion, and issues get new owners. Now you have better insight into these changes.

Pull requests also make use of our new Deployments API, which lets you know exactly when a pull request has made it to your testing, staging, and production environments:

The new labels & milestones pages

Labels and milestones can help with managing a project's issues, but it's also important to make sure you can manage the labels and milestones themselves. Two new pages offer a better vantage point into the overall health of your project:

The new labels page (example):

...and an updated milestones page (example):

All the small things

There's a slew of smaller changes that went into this release of Issues as well:

You'll get a notification if an issue is assigned to you.
No more mixing: the "issues" tab will only show you issues, and the pull requests tab will still only show pull requests. Want to see them together again? Just remove the is:issue or is:pr filter from your search query.
If you use Task Lists, we'll show the overall progress on that issue or pull request on the listing page:
You can add labels and assign pull requests to milestones even if you have issues disabled on your repository.
New keyboard shortcuts mean it's quick to filter down to what you want. Type ? on an issues listing to get a list of the available keyboard shortcuts.
You can now triage multiple pull requests at once by selecting them and changing their label, assignment, state, or milestone, just like issues.

Learn more about Issues

Check out our updated guide on Mastering Issues to learn more about workflows and how to make issues work for you. And, of course, we've updated our help documentation for the new GitHub Issues, so if you run into any problems, be sure to give them a peek.

A better Issues

Software is about getting things done: either by doing the work, or planning out how to do the work. We hope the new GitHub Issues gets you there quicker and happier.

After we shipped the ability to view GeoJSON & TopoJSON, users have put tons of cool maps on GitHub, but sometimes you still need to see the underlying GeoJSON. Now you can! Map files can now be toggled between their source code and their map rendered representation.

GitHub's annual data challenge is back, and we can't wait to see what you'll build this year, be it beautiful generative art or full blown, third-party activity dashboards. Check out the winners from 2013 and 2012 for some inspiration.

The Details

Entries are generally visualizations, prose descriptions of data analyses, or both. We love innovative entries, so an "entry" is defined somewhat loosely.

There are only three rules:

To enter, you must fill out our submission form by midnight PDT on August 25th, 2014.
Your entry needs to use publicly available GitHub data from any number of available sources described below.
Show your work! Whatever you submit needs associated code or documentation describing what data you used and how you processed it. Some examples of what we're looking for include code (and instructions to use it) in a GitHub repository, an academic write-up of your analysis, or an informal prose write-up. If you're not linking to a repository, you should submit a Gist with your documentation.

After the submission deadline on August 25th, GitHub employees will review and vote on all entries to pick the three top winners. We'll send out notifications to those top three by mid-September.

Data Sources

GitHub activity data is available from several publicly-available sources. Here are a few links to get you started:

Our very own API.
The GitHub Archive, providing historical archives of our public timeline data.
Google BigQuery, where GitHub's public timeline is a featured public dataset; see the GitHub Archive home page for getting started instructions.
GHTorrent, which maintains a relational model of GitHub activity data and offers archives for download.

ProTips

There are a few things we're looking for when we score your entry:

Innovation/Story: Does your entry tell a good, data-driven story? Does it reveal interesting insights about GitHub activity? We love it when we're surprised by new insights hidden in our own data.
Accuracy: Is your analysis accurate? Do accompanying visualizations clearly and unambiguously convey your conclusions?
Completeness: Is your entry a code submission? If so, is your code well-organized and documented? Can others easily understand and reproduce your analysis from the materials you've submitted?

The Prizes

The winning entry in this year's data challenge will receive an all-expense paid trip to attend a one-day data visualization course taught by Edward Tufte, a data visualization expert and the author of some of our favorite books on visualization. We'll cover your enrollment for the course (either December 18th or 19th in San Francisco, CA), along with travel expenses to and from San Francisco, lodging at a nearby hotel for two nights (the evening before and of the course), and your meals.

The second and third prize contestants will receive $500 and $250 cash prizes, respectively.

Finally, all winners will have their GitHub profile and their data challenge entry publicly featured on our blog!

If you have questions about the data challenge rules, drop us a line at data@github.com. Good luck!

Following the recent release of GitHub for Windows 2.0, we’ve been working hard to bring our two desktop apps closer together.

We’ve just shipped a significant new update to GitHub for Mac, with simplified navigation and a renewed focus on your cloned repositories.

With this update, you’ll be able to spend less time navigating lists of respositories, and more time focusing on your repositories and your branches.

Simplified Navigation

The sidebar now features all your repositories grouped by their origin, and the new toolbar lets you create, clone, and publish additional repositories quickly. You can also press ⇧⌘O to filter local repositories from those associated with GitHub or GitHub Enterprise, and switch between them.

Cloning repositories from GitHub

Fewer steps are required to clone repositories from GitHub Enterprise or GitHub.com. You can now press ⌃⌘O, type the repository name, and then press Enter to clone the repositories you want and need.

Switching and creating new branches

The branch popover (⌘B) has moved to the new toolbar, and now has a “Recent Branches” section that provides a convenient way to switch between all of your in-progress branches.

Branch creation (⇧⌘N) has moved to its own popover, and you can now create a new branch from any existing branch.

How do I get it?

GitHub for Mac will automatically update itself to the latest version. To update right away, open the “GitHub” menu, then “Check For Updates…”, or visit mac.github.com to download the latest release.

NOTE: This release and future releases of GitHub for Mac require OS X 10.8 or later. If you are still running OS X 10.7, you will not be updated to this release.

Feedback

We’d love to hear what you think about this release. If you have any comments, questions or straight-up bug reports, please get in touch.

Aug	SEP	Oct
	10
2013	2014	2015