
New and improved two-factor lockout recovery process

Starting January 31, 2017, the Recover Accounts Elsewhere feature will let you associate your GitHub account with your Facebook account, giving you a way back into GitHub in certain two-factor authentication lockout scenarios. If you've lost your phone or token, or otherwise can no longer use them, and don't have a usable backup, you can recover your account through Facebook and get back to work. See how the new recovery feature works on the GitHub Engineering Blog.

Image of recovery screen on Facebook

Currently, if you lose the ability to authenticate with your phone or token, you have to prove account ownership before we can disable two-factor authentication. Proving ownership requires access to a confirmed email address and a valid SSH private key for a given account. This feature will provide an alternative proof of account ownership that can be used along with these other methods.

To set up the new recovery option, save a token on the security settings page on GitHub. Then confirm that you'd like to store the token. If you get locked out for any reason, you can contact GitHub Support, log in to Facebook, and start the recovery process.

Image of recovery option on GitHub

Bug Bounty anniversary promotion: bigger bounties in January and February

Extra payouts for GitHub Bug Bounty Third Year Anniversary

The GitHub Bug Bounty Program is turning three years old. To celebrate, we're offering bigger bounties for the most severe bugs found in January and February.

The bigger the bug, the bigger the prize

The process is the same as always: hackers and security researchers find and report vulnerabilities through our responsible disclosure process. To recognize the effort these researchers put forth, we reward them with actual money. Standard bounties range between $500 and $10,000 USD and are determined at our discretion, based on overall severity. In January and February we're throwing in bonus rewards for standout individual reports in addition to the usual payouts.

Bug bounty prizes of $12,000, $8,000, and $5,000 on top of the usual payouts

And t-shirts, obviously

In addition to cash prizes, we've also made limited edition t-shirts to thank you for helping us hunt down GitHub bugs. We don't have enough for everyone—just for the 15 submitters with the most severe bugs.

Enterprise bugs count, too

GitHub Enterprise is now included in the bounty program. So go ahead and find some Enterprise bugs. If they're big enough, you'll be eligible for the promotional bounty. Otherwise, rewards are the same as for GitHub.com ($200 to $10,000 USD). For more details, visit our bounty site.

Giving winners some extra cash doesn't mean anyone has to lose. If you find a bug, you'll still receive the standard bounties.

Happy hunting!

Incident Report: Inadvertent Private Repository Disclosure

On Thursday, October 20th, a bug in GitHub’s system exposed a small amount of user data via Git pulls and clones. In total, 156 private repositories of GitHub.com users were affected (including one of GitHub's). We have notified everyone affected by this private repository disclosure, so if you have not heard from us, your repositories were not impacted and there is no ongoing risk to your information.

This was not an attack, and no one was able to retrieve vulnerable data intentionally. There was no outsider involved in exposing this data; this was a programming error that resulted in a small number of Git requests retrieving data from the wrong repositories.

Regardless of whether or not this incident impacted you specifically, we want to sincerely apologize. It’s our responsibility not only to keep your information safe but also to protect the trust you have placed in us. GitHub would not exist without your trust, and we are deeply sorry that this incident occurred.

Below is the technical analysis of our investigation, including a high-level overview of the incident, how we mitigated it, and the specific measures we are taking to prevent incidents like this from happening in the future.

High-level overview

To speed up unicorn worker boot times and simplify the post-fork boot code, we applied the following buggy patch:

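The sketch below reconstructs the change from the description that follows; the exact call sites and method names are assumptions, not the actual diff.

# Sketch only: reconstructed from the description below, not the actual patch.
# Before: after forking, walk every ConnectionPool object in memory and
# disconnect it, covering the manually managed read-only and Spokes pools too.
-  ObjectSpace.each_object(ActiveRecord::ConnectionAdapters::ConnectionPool).each(&:disconnect!)
# After: only clear the pools that Active Record itself manages, leaving the
# manually managed pools shared across the forked unicorn workers.
+  ActiveRecord::Base.clear_all_connections!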

The database connections in our Rails application are split into three pools: a read-only group, a group used by Spokes (our distributed Git back-end), and the normal Active Record connection pool. The read-only group and the Spokes group are managed manually by our own connection-handling code. The new line of code disconnected only the ConnectionPool objects managed by Active Record, whereas the previous snippet disconnected every ConnectionPool object held in memory. As a result, once this change was running, the manually managed pools were shared between all child processes of the Rails application.

For most queries, the impact of this bug was a malformed response, which errored and caused a near-immediate rollback. However, a very small percentage of query responses were interpreted as legitimate data: the file server and disk path where repository data was stored. As a result, some repository requests were routed to the location of another repository. The application could not differentiate these incorrect query results from legitimate ones, so users received data that they were not meant to receive.

When properly functioning, the system works as sketched out roughly below. However, during this failure window, the MySQL response in step 4 was returning malformed data that would end up causing the git proxy to return data from the wrong file server and path.

System Diagram

Our analysis of the ten-minute window in question uncovered:

  • 17 million requests to our git proxy tier, most of which failed with errors due to the buggy deploy
  • 2.5 million requests successfully reached git-daemon on our file server tier
  • Of the 2.5 million requests that reached our file servers, the vast majority were "already up to date" no-op fetches
  • 40,000 of the 2.5 million requests were non-empty fetches
  • 230 of the 40,000 non-empty requests were susceptible to this bug and served incorrect data
  • This represented 0.0013% of the total operations at the time

Deeper analysis and forensics

After establishing the effects of the bug, we set out to determine which requests were affected in this way for the duration of the deploy. Normally, this would be an easy task, as we have an in-house monitor for Git that logs every repository access. However, those logs contained some of the same faulty data that led to the misrouted requests in the first place. Without accurate usernames or repository names in our primary Git logs, we had to turn to data that our git proxy and git-daemon processes sent to syslog. In short, the goal was to join records from the proxy, to git-daemon, to our primary Git logging, drawing whatever data was accurate from each source. Correlating records across servers and data sources is a challenge because the timestamps differ depending on load, latency, and clock skew. In addition, a given Git request may be rejected at the proxy or by git-daemon before it reaches Git, leaving records in the proxy logs that don’t correlate with any records in the git-daemon or Git logs.

Ultimately, we joined the data from the proxy to our Git logging system using timestamps, client IPs, and the number of bytes transferred and then to git-daemon logs using only timestamps. In cases where a record in one log could join several records in another log, we considered all and took the worst-case choice. We were able to identify cases where the repository a user requested, which was recorded correctly at our git proxy, did not match the repository actually sent, which was recorded correctly by git-daemon.

We further examined the number of bytes sent for a given request. In many cases where incorrect data was sent, the number of bytes was far larger than the on-disk size of the repository that was requested but instead closely matched the size of the repository that was sent. This gave us further confidence that indeed some repositories were disclosed in full to the wrong users.

Although we saw over 100 misrouted fetches and clones, we saw no misrouted pushes, signaling that the integrity of the data was unaffected. This is because a Git push operation takes place in two steps: first, the user uploads a pack file containing files and commits. Then we update the repository’s refs (branch tips) to point to commits in the uploaded pack file. These steps look like a single operation from the user’s point of view, but within our infrastructure, they are distinct. To corrupt a Git push, we would have to misroute both steps to the same place. If only the pack file is misrouted, then no refs will point to it, and git fetch operations will not fetch it. If only the refs update is misrouted, it won’t have any pack file to point to and will fail. In fact, we saw two pack files misrouted during the incident. They were written to a temporary directory in the wrong repositories. However, because the refs-update step wasn’t routed to the same incorrect repository, the stray pack files were never visible to the user and were cleaned up (i.e., deleted) automatically the next time those repositories performed a “git gc” garbage-collection operation. So no permanent or user-visible effect arose from any misrouted push.

A misrouted Git pull or clone operation consists of several steps. First, the user connects to one of our Git proxies, via either SSH or HTTPS (we also support git-protocol connections, but no private data was disclosed that way). The user’s Git client requests a specific repository and provides credentials, an SSH key or an account password, to the Git proxy. The Git proxy checks the user’s credentials and confirms that the user has the ability to read the repository he or she has requested. At this point, if the Git proxy gets an unexpected response from its MySQL connection, the authentication (which user is it?) or authorization (what can they access?) check will simply fail and return an error. Many users were told during the incident that their repository access “was disabled due to excessive resource use.”

In the operations that disclosed repository data, the authentication and authorization step succeeded. Next, the Git proxy performs a routing query to see which file server the requested repository is on, and what its file system path on that server will be. This is the step where incorrect results from MySQL led to repository disclosures. In a small fraction of cases, two or more routing queries ran on the same Git proxy at the same time and received incorrect results. When that happened, the Git proxy got a file server and path intended for another request coming through that same proxy. The request ended up routed to an intact location for the wrong repository. Further, the information that was logged on the repository access was a mix of information from the repository the user requested and the repository the user actually got. These corrupted logs significantly hampered efforts to discover the extent of the disclosures.

Once the Git proxy got the wrong route, it forwarded the user’s request to git-daemon and ultimately Git, running in the directory for someone else’s repository. If the user was retrieving a specific branch, it generally did not exist, and the pull failed. But if the user was pulling or cloning all branches, that is what they received: all the commits and file objects reachable from all branches in the wrong repository. The user (or more often, their build server) might have been expecting to download one day’s commits and instead received some other repository’s entire history.

Users who inadvertently fetched the entire history of some other repository, surprisingly, may not even have noticed. A subsequent “git pull” would almost certainly have been routed to the right place and would have corrected any overwritten branches in the user’s working copy of their Git repository. The unwanted remote references and tags are still there, though. Such a user can delete the remote references, run “git remote prune origin,” and manually delete all the unwanted tags. As a possibly simpler alternative, a user with unwanted repository data can delete that whole copy of the repository and “git clone” it again.

Next steps

To prevent this from happening again, we will modify the database driver to detect and only interpret responses that match the packet IDs sent by the database. On the application side, we will consolidate the connection pool management so that Active Record's connection pooling will manage all connections. We are following this up by upgrading the application to a newer version of Rails that doesn't suffer from the "connection reuse" problem.
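As a schematic illustration of the first change (this is not the actual driver code; the class and method names here are invented for the example), the driver would remember the packet ID of each query it sends and refuse to interpret any response carrying a different ID:

# Illustrative sketch only -- not the real database driver.
class CheckedConnection
  class MismatchedResponse < StandardError; end

  def initialize(raw_connection)
    @raw = raw_connection
  end

  def query(sql)
    packet_id = @raw.send_query(sql)   # ID attached to the outgoing packet
    response  = @raw.read_response
    unless response.packet_id == packet_id
      # A response belonging to some other query must never reach the caller.
      raise MismatchedResponse, "expected packet #{packet_id}, got #{response.packet_id}"
    end
    response
  end
end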

We will continue to analyze the events surrounding this incident and use our investigation to improve the systems and processes that power GitHub. We consider the unauthorized exposure of even a single private repository to be a serious failure, and we sincerely apologize that this incident occurred.

GitHub Extension for Visual Studio 2.0 is now available

We're pleased to announce that version 2.0 of the GitHub Extension for Visual Studio is now available. You can install it directly from the Tools and Extensions gallery in Visual Studio, as well as from our releases page and our website.

This release allows you to list your existing pull requests and create new ones directly from Visual Studio. You can also create gists directly from your code, making it even easier to share your code and collaborate. It also includes a slew of other enhancements and fixes, available in our release notes.

Screenshot of Visual Studio with GitHub pane open

As we focus our efforts on building workflows targeted towards maintainers and contributors, we'd love to hear your thoughts on how we can improve our tools. Let us know what works and what doesn't and how to make your coding and collaboration workflows easier.

GitHub Pages to upgrade to Jekyll 3.2

GitHub Pages will upgrade to Jekyll 3.2 on August 23rd.

The upgrade to Jekyll 3.2 comes with over 100 improvements, including the introduction of themes. That means you'll soon be able to create a beautiful site in minutes by simply adding theme: my-awesome-theme to your site's config, without needing to copy styles or templates into your site's repository.

This should be a seamless transition for all GitHub Pages users, but if you have a particularly complex Jekyll site, we recommend building your site locally with the latest version of Jekyll 3.2.x prior to August 23rd to ensure your site continues to build as expected.

For more information, see the Jekyll changelog, and if you have any questions, we encourage you to get in touch with us.

GitHub Desktop now has a dark side

GitHub Desktop on Windows is a nice complement to developer tools such as Atom and Visual Studio. Now it visually complements those tools too! The latest update adds the ability to select a new dark theme.

GitHub Desktop with dark theme enabled

You can access this setting from the Options menu in GitHub Desktop.

GitHub Pages to upgrade to Jekyll 3.1.4

GitHub Pages will upgrade to the soon-to-be-released Jekyll 3.1.4 on May 23rd. The Jekyll 3.1.x branch brings significant performance improvements to the build process, adds a handful of helpful Liquid filters, and fixes a few minor bugs.

This should be a seamless transition for all GitHub Pages users, but if you have a particularly complex Jekyll site, we recommend building your site locally with the latest version of Jekyll 3.1.x prior to May 18th to ensure your site continues to build as expected.

For more information, see the Jekyll changelog, and if you have any questions, we encourage you to get in touch with us.

Edit: To ensure a smooth transition for users, the upgrade has been rescheduled to May 23rd.

Expanded webhook events

Webhooks are one of the more powerful ways to extend GitHub. They allow internal tools and third-party integrations to subscribe to specific activity on GitHub and receive notifications (via an HTTP POST) to an external web server when those events happen. They are often used to trigger CI builds, deploy applications, or update external bug trackers.

Based on your feedback, we've expanded the kinds of events to which you can subscribe. New events include:

  • Editing an issue or pull request's title or body
  • Changing a repository's visibility to public or private
  • Deleting a repository
  • Editing an issue comment, pull request comment, or review comment
  • Deleting an issue comment, pull request comment, or review comment

When the action in question is an edit, the webhook's payload will helpfully point out what changed.

changed payload
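For example, an issues webhook fired for an edited title carries a changes object describing the previous value, roughly like the abridged payload below (field values are illustrative):

{
  "action": "edited",
  "changes": {
    "title": {
      "from": "Old issue title"
    }
  },
  "issue": {
    "number": 1347,
    "title": "New issue title"
  }
}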

These expanded webhook events are now available on GitHub.com. For more information, check out the developer blog and take a look at the documentation for a full list of webhook events.

Delivering Octicons with SVG

GitHub.com no longer delivers its icons via icon font. Instead, we’ve replaced all the Octicons throughout our codebase with SVG alternatives. While the changes are mostly under-the-hood, you’ll immediately feel the benefits of the SVG icons.

Octicon comparison

Switching to SVG renders our icons as images instead of text, locking nicely to whole pixel values at any resolution. Compare the zoomed-in icon font version on the left with the crisp SVG version on the right.

Why SVG?

Icon font rendering issues

Icon fonts have always been a hack. We originally used a custom font with our icons as unicode symbols. This allowed us to include our icon font in our CSS bundle. Simply adding a class to any element would make our icons appear. We could then change the size and color on the fly using only CSS.

Unfortunately, even though these icons were vector shapes, they’d often render poorly on 1x displays. In Webkit-based browsers, you’d get blurry icons depending on the browser’s window width. Since our icons were delivered as text, sub-pixel rendering meant to improve text legibility actually made our icons look much worse.

Page rendering improvements

Since our SVG is injected directly into the markup (more on why we used this approach in a bit), we no longer see a flash of unstyled content as the icon font is downloaded, cached, and rendered.

Jank

Accessibility

As laid out in Death to Icon Fonts, some users override GitHub’s fonts. For dyslexics, certain typefaces can be more readable. For users who change their fonts, our font-based icons were rendered as empty squares. This messed up GitHub’s page layouts and didn’t provide any meaning. SVGs will display regardless of font overrides. For screen readers, SVG gives us the ability to add pronounceable alt attributes, or leave them off entirely.

Properly sized glyphs

For each icon, we currently serve a single glyph at all sizes. Since the loading of our site is dependent on the download of our icon font, we were forced to limit the icon set to just the essential 16px shapes. This led to some concessions on the visuals of each symbol since we’d optimized for the 16px grid. When scaling our icons up in blankslates or marketing pages, we’re still showing the 16px version of the icon. With SVGs, we can easily fork the entire icon set and offer more appropriate glyphs at any size we specify. We could have done this with our icon fonts, but then our users would need to download twice as much data. Possibly more.

Ease of authoring

Building custom fonts is hard. A few web apps have popped up to solve this pain. Internally, we’d built our own. With SVG, adding a new icon could be as trivial as dragging another SVG file into a directory.

We can animate them

We’re not saying we should, but we could, though SVG animation does have some practical applications—preloader animations, for example.

How

Our Octicons appear nearly 2500 times throughout GitHub’s codebase. Prior to SVG, Octicons were included as simple spans <span class="octicon octicon-alert"></span>. To switch to SVG, we first added a Rails helper for injecting SVG paths directly into our markup. Relying on the helper allowed us to test various methods of delivering SVG to our staff before enabling it for the public. Should a better alternative to SVG come along, or if we need to revert to icon fonts for any reason, we’d only have to change the output of the helper.

Helper usage

Input
<%= octicon(:symbol => "plus") %>

Output

<svg aria-hidden="true" class="octicon octicon-plus" width="12" height="16" role="img" version="1.1" viewBox="0 0 12 16">
    <path d="M12 9H7v5H5V9H0V7h5V2h2v5h5v2z"></path>
</svg>

Our approach

You can see we’ve landed on injecting the SVGs directly into our page markup. This gives us the flexibility to change the color of the icons on the fly with CSS, using the fill: declaration.

Instead of an icon font, we now have a directory of SVG shapes whose paths are directly injected into the markup by our helper based on which symbol we choose. For example, if we want an alert icon, we call the helper <%= octicon(:symbol => "alert") %>. It looks for the icon of the same file name and injects the SVG.
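A simplified sketch of how such a helper can work is shown below. This is not GitHub’s production helper; the directory layout and method body are assumptions.

# Simplified sketch, not the production helper: look up the SVG file that
# shares the requested symbol's name and inject its contents into the markup.
module OcticonsHelper
  SVG_DIR = Rails.root.join("app", "assets", "octicons")  # assumed location

  def octicon(options = {})
    symbol = options.fetch(:symbol)
    # e.g. app/assets/octicons/alert.svg is emitted inline, so CSS can style
    # it with fill: and no separate icon-font download is needed.
    File.read(SVG_DIR.join("#{symbol}.svg")).html_safe
  end
end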

We tried a number of approaches when adding SVG icons to our pages. Given the constraints of GitHub’s production environment, some were dead-ends.

  1. External .svg — We first attempted to serve a single external “svgstore”. We’d include individual sprites using the <use> element. With our current cross-domain security policy and asset pipeline, we found it difficult to serve the SVG sprites externally.
  2. SVG background images — This wouldn’t let us color our icons on the fly.
  3. SVGs linked via <img> and the src attribute — This wouldn’t let us color our icons on the fly.
  4. Embedding the entire “svgstore” in every view and using <use> — It just didn’t feel quite right to embed every SVG shape we have on every single page throughout GitHub.com, especially if a page didn’t include a single icon.

Performance

We’ve found there were no adverse effects on pageload or performance when switching to SVG. We’d hoped for a more dramatic drop in rendering times, but often performance has more to do with perception. Since SVG icons are being rendered like images in the page with defined widths and heights, the page doesn’t have nearly as much jank.

We were also able to kill a bit of bloat from our CSS bundles since we’re no longer serving the font CSS.

Drawbacks & Gotchas

  • Firefox still has pixel-rounding errors in SVG, though the icon font had the same issue.
  • You may have to wrap these SVGs with another div if you want to give them a background color.
  • Since SVG is being delivered as an image, some CSS overrides might need to be considered. If you see anything weird in our layouts, let us know.
  • Internet Explorer needs defined width and height attributes on the svg element in order for the icons to be sized correctly.
  • We were serving both SVG and our icon font during our transition. This would cause IE to crash while we were still applying font-family to each of the SVG icons. This was cleared up as soon as we transitioned fully to SVG.

TL;DR

By switching from icon fonts, we can serve our icons more easily, more quickly, and more accessibly. And they look better. Enjoy.

Two years of bounties

Despite the best efforts of its writers, software has vulnerabilities, and GitHub is no exception. Finding, fixing, and learning from past bugs is a critical part of keeping our users and their data safe on the Internet. Two years ago, we launched the GitHub Security Bug Bounty and it's been an incredible success. By rewarding the talented and dedicated researchers in the security industry, we discover and fix security vulnerabilities before they can be exploited.

Bugs squashed

Bounty Submissions Per Week

Of 7,050 submissions in the past two years, 1,772 warranted further review, helping us to identify and fix vulnerabilities spanning all of the OWASP top 10 vulnerability classifications. 58 unique researchers earned a cumulative $95,300 for the 102 medium to high risk vulnerabilities they reported. This chart shows the breakdown of payouts by severity and OWASP classification:

             A1    A2    A3    A4    A5    A6    A7    A8    A9    A10   A0    Sum
Low          4     5     11.5  1     8     11    5     7     3     3.5   1     60
Medium       2     1     12    0     1     1     5     1     0     3     0     26
High         2     2     3.5   0     2     1     0     0.5   0     0     2     13
Critical     3     0     0     0     0     0     0     0     0     0     0     3
Sum          11    8     27    1     11    13    10    8.5   4     6.5   3     102

Highlights

We love it when a reported vulnerability ends up not being our fault. @kelunik and @bwoebi reported a browser vulnerability that caused GitHub's cookies to be sent to other domains. @ealf reported a browser bug that bypassed our JavaScript same-origin policy checks. We were able to protect our users from these vulnerabilities months before the browser vendors released patches.

Another surprising bug was reported by @cryptosense, who found that some RSA key generators were creating SSH keys that were trivially factorable. We ended up finding and revoking 309 weak RSA keys and now have validations checking if keys are factorable by the first 10,000 primes.

In the first year of the bounty program, we saw reports mostly about our web services. In 2015, we received a number of reports for vulnerabilities in our desktop apps. @tunz reported a clever exploit against GitHub for Mac, allowing remote code execution. Shortly thereafter, @joernchen reported a similar bug in GitHub for Windows, following up a few months later with a separate client-side remote code execution vulnerability in Git Large File Storage (LFS).

Hacking with purpose

In 2015 we saw an amazing increase in the number of bounties donated to a good cause. GitHub matches bounties donated to 501(c)(3) organizations, and with the help of our researchers we contributed to the EFF, Médecins Sans Frontières, the Ada Initiative, the Washington State Burn Foundation, and the Tor Project. A big thanks to @ealf, @LukasReschke, @arirubinstein, @cryptosense, @bureado, @vito, and @s-rah for their generosity.

Get involved

In the first two years of the program, we paid researchers nearly $100,000. That's a great start, but we hope to further increase participation in the program. So, fire up your favorite proxy and start poking at GitHub.com. When you find a vulnerability, report it and join the ranks of our leaderboard. Happy hacking!

January 28th Incident Report

Last week GitHub was unavailable for two hours and six minutes. We understand how much you rely on GitHub and consider the availability of our service one of the core features we offer. Over the last eight years we have made considerable progress towards ensuring that you can depend on GitHub to be there for you and for developers worldwide, but a week ago we failed to maintain the level of uptime you rightfully expect. We are deeply sorry for this, and would like to share with you the events that took place and the steps we’re taking to ensure you're able to access GitHub.

The Event

At 00:23am UTC on Thursday, January 28th, 2016 (4:23pm PST, Wednesday, January 27th), our primary data center experienced a brief disruption in the systems that supply power to our servers and equipment. Slightly over 25% of our servers and several network devices rebooted as a result. This left our infrastructure in a partially operational state and generated alerts to multiple on-call engineers. Our load balancing equipment and a large number of our frontend application servers were unaffected, but the systems they depend on to service your requests were unavailable. In response, our application began to deliver HTTP 503 response codes, which carry the unicorn image you see on our error page.

Our early response to the event was complicated by the fact that many of our ChatOps systems were on servers that had rebooted. We do have redundancy built into our ChatOps systems, but this failure still caused some amount of confusion and delay at the very beginning of our response. One of the biggest customer-facing effects of this delay was that status.github.com wasn't set to status red until 00:32am UTC, eight minutes after the site became inaccessible. We consider this to be an unacceptably long delay, and will ensure faster communication to our users in the future.

Initial notifications for unreachable servers and a spike in exceptions related to Redis connectivity directed our team to investigate a possible outage in our internal network. We also saw an increase in connection attempts that pointed to network problems. While later investigation revealed that a DDoS attack was not the underlying problem, we spent time early on bringing up DDoS defenses and investigating network health. Because we have experience mitigating DDoS attacks, our response procedure is now habit and we are pleased we could act quickly and confidently without distracting other efforts to resolve the incident.

With our DDoS shields up, the response team began to methodically inspect our infrastructure and correlate these findings back to the initial outage alerts. The inability to reach all members of several Redis clusters led us to investigate uptime for devices across the facility. We discovered that some servers were reporting uptimes of several minutes, while our network equipment was reporting uptimes showing it had not rebooted. From this, we determined that all of the offline servers shared the same hardware class, and the ones that rebooted without issue were of a different hardware class. The affected servers spanned many racks and rows in our data center, which resulted in several clusters experiencing reboots of all of their member servers, despite those members being distributed across different racks.

As the minutes ticked by, we noticed that our application processes were not starting up as expected. Engineers began taking a look at the process table and logs on our application servers. These explained that the lack of backend capacity was a result of processes failing to start due to our Redis clusters being offline. We had inadvertently added a hard dependency on our Redis cluster being available within the boot path of our application code.

By this point, we had a fairly clear picture of what was required to restore service and began working towards that end. We needed to repair our servers that were not booting, and we needed to get our Redis clusters back up to allow our application processes to restart. Remote access console screenshots from the failed hardware showed boot failures because the physical drives were no longer recognized. One group of engineers split off to work with the on-site facilities technicians to bring these servers back online by draining the flea power to bring them up from a cold state so the disks would be visible. Another group began rebuilding the affected Redis clusters on alternate hardware. These efforts were complicated by a number of crucial internal systems residing on the offline hardware. This made provisioning new servers more difficult.

Once the Redis cluster data was restored onto standby equipment, we were able to bring the Redis-server processes back online. Internal checks showed application processes recovering, and a healthy response from the application servers allowed our HAProxy load balancers to return these servers to the backend server pool. After verifying site operation, the maintenance page was removed and we moved to status yellow. This occurred two hours and six minutes after the initial outage.

The following hours were spent confirming that all systems were performing normally, and verifying there was no data loss from this incident. We are grateful that much of the disaster mitigation work put in place by our engineers was successful in guaranteeing that all of your code, issues, pull requests, and other critical data remained safe and secure.

Future Work

Complex systems are defined by the interaction of many discrete components working together to achieve an end result. Understanding the dependencies for each component in a complex system is important, but unless these dependencies are rigorously tested it is possible for systems to fail in unique and novel ways. Over the past week, we have devoted significant time and effort towards understanding the nature of the cascading failure which led to GitHub being unavailable for over two hours. We don’t believe it is possible to fully prevent the events that resulted in a large part of our infrastructure losing power, but we can take steps to ensure recovery occurs in a fast and reliable manner. We can also take steps to mitigate the negative impact of these events on our users.

We identified the hardware issue that left servers unable to see their own drives after power-cycling as a known firmware bug, and we are now updating that firmware across our fleet. We are also updating our tooling to automatically open issues for the team when new firmware updates are available, which will force us to review the changelogs against our environment.

We will be updating our application’s test suite to explicitly ensure that our application processes start even when certain external systems are unavailable and we are improving our circuit breakers so we can gracefully degrade functionality when these backend services are down. Obviously there are limits to this approach and there exists a minimum set of requirements needed to serve requests, but we can be more aggressive in paring down the list of these dependencies.

We are reviewing the availability requirements of the internal systems responsible for crucial operations tasks, such as provisioning new servers, so that they are on par with our user-facing systems. Ultimately, if these systems are required to recover from an unexpected outage situation, they must be as reliable as the systems being recovered.

A number of less technical improvements are also being implemented. Strengthening our cross-team communications would have shaved minutes off the recovery time. Predefining escalation strategies during situations that require all hands on deck would have enabled our incident coordinators to spend more time managing recovery efforts and less time navigating documentation. Improving our messaging to you during this event would have helped you better understand what was happening and set expectations about when you could expect future updates.

In Conclusion

We realize how important GitHub is to the workflows that enable your projects and businesses to succeed. All of us at GitHub would like to apologize for the impact of this outage. We will continue to analyze the events leading up to this incident and the steps we took to restore service. This work will guide us as we improve the systems and processes that power GitHub.

Update on 1/28 service outage

On Thursday, January 28, 2016 at 00:23am UTC, we experienced a severe service outage that impacted GitHub.com. We know that any disruption in our service can impact your development workflow, and are truly sorry for the outage. While our engineers are investigating the full scope of the incident, I wanted to quickly share an update on the situation with you.

A brief power disruption at our primary data center caused a cascading failure that impacted several services critical to GitHub.com's operation. While we worked to recover service, GitHub.com was unavailable for two hours and six minutes. Service was fully restored at 02:29am UTC. Last night we completed the final procedure to fully restore our power infrastructure.

Millions of people and businesses depend on GitHub. We know that our community deeply feels the effects of our site going down. We’re actively taking measures to improve our resilience and response time, and will share details from these investigations.

GitHub implements Subresource Integrity

With Subresource Integrity (SRI), using GitHub is safer than ever. SRI tells your browser to double check that our Content Delivery Network (CDN) is sending the right JavaScript and CSS to your browser. Without SRI, an attacker who is able to compromise our CDN could send malicious JavaScript to your browser. To get the benefits of SRI, make sure you're using a modern browser like Google Chrome.
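Concretely, SRI is an integrity attribute on script and link tags: the browser hashes the file the CDN delivers and refuses to use it if the hash doesn't match the one baked into the page. The tag below is only an illustration, with a placeholder URL and hash rather than a real GitHub asset:

<!-- Illustration only: placeholder URL and hash, not an actual GitHub asset -->
<script src="https://cdn.example.com/frameworks.js"
        integrity="sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU="
        crossorigin="anonymous"></script>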

New browser security features like SRI are making the web a safer place. They don't do much good if websites don't implement them though. We're playing our role, and encourage you to consider doing the same.

You can read more about Subresource Integrity and why we implemented it on the GitHub Engineering blog.

GitHub Desktop is now available

The new GitHub Desktop is now available. It's a fast, easy way to contribute to projects from OS X and Windows. Whether you're new to GitHub or a seasoned user, GitHub Desktop is designed to simplify essential steps in your GitHub workflow and replace GitHub for Mac and Windows with a unified experience across both platforms.

desktop

Branch off

Branches are essential to proposing changes and reviewing code on GitHub, and they’re always available in GitHub Desktop’s repository view. Just select the current branch to switch branches or create a new one.

Collaborate

Craft the perfect commit by selecting the files—or even the specific lines—that make up a change directly from a diff. You can commit your changes or open a pull request without leaving GitHub Desktop or using the command line.

Merge and Deploy

Browse commits on local and remote branches to quickly and clearly see what changes still need to be merged. You can also merge your code to the master branch for deployment right from the app.

GitHub-flow

Ready to start collaborating? Download GitHub Desktop. If you're using GitHub for Mac or Windows, the upgrade is automatic.

Adopting the Open Code of Conduct

Update: TODO Group will not be continuing work on the open code of conduct. See the followup post for more information.

We are proud to be working with the TODO Group on the Open Code of Conduct, an easy-to-reuse code of conduct for open source communities. We have adopted the Open Code of Conduct for the open source projects that we maintain, including Atom, Electron, Git LFS, and many others. The Open Code of Conduct does not apply to all activity on GitHub, and it does not alter GitHub's Terms of Service.

Open source software is used by everyone, and we believe open source communities should be a welcoming place for all participants. By adopting and upholding a Code of Conduct, communities can communicate their values, establish a baseline for how people are expected to treat each other, and outline a process for dealing with unwelcome behavior when it arises.

The Open Code of Conduct is inspired by the codes of conduct and diversity statements of several other communities, including Django, Python, Ubuntu, Contributor Covenant, and Geek Feminism. These communities are leaders in making open source welcoming to everyone.

If your project doesn't already have a code of conduct, then we encourage you to check out the Open Code of Conduct and consider if your community can commit to upholding it. Read more about it on the TODO Group blog.
