What is the standard for inclusion of features in pvlib? #1898

Unanswered
mikofski asked this question in Q&A
Discussion options

Do pvlib features require a reference or citation from the literature? I can't find where this is explicitly stated. Should this even be a requirement?

Pro: A reference or citation provides expanded detail about the feature, helping readers better understand the original implementation and the origins of coefficients, parameters, and algorithms.
Con: This level of depth may not be appropriate for pvlib documentation. There is no guarantee that a reference has any more depth than the pvlib docs, and for simple methods, is such depth even required? Occasionally references don't even provide coefficients or sufficiently explain algorithms or parameters. There is evidence in pvlib of methods whose terms conflicted depending on which papers presented them.

Pro: There are free and open-access journals, self-publishing is an option, and Zenodo generates DOIs.
Con: References may be an unnecessary barrier to introducing new methods. Conferences and workshops require travel and expenses that may be prohibitive for some contributors. Some open-access journals have excessive fees; others may have low impact, but does that matter?

Pro: Requiring a reference forces contributors to fully hash out methods and algorithms. The stated aim of the 2018 JOSS paper was to provide reference implementations of models relevant to solar energy, and this was echoed in the 2014 PVSC-40 pvlib paper: "PV_LIB provides a common repository for the release of published modeling algorithms".
Con: Could or should pvlib be a reference itself? Should pvlib be a proving ground for new concepts, or only a way to vet and reuse established methods? How do we as a community determine whether a method is established or useful?

Pro: Citations and references increase user confidence in pvlib when users are already familiar with the published methods it implements; pvlib then serves as a convenient standard implementation of those established methods, reducing errors and increasing repeatability.
Con: There are implementations in pvlib of methods that, although published, are not standard or established. Solar is a fast-changing field and new methods evolve continuously, so pvlib can be a standard implementation that accelerates adoption of better methods. Many new users aren't aware of which methods are, or should be, most widely used; the pvlib documentation could do a better job of guiding users to those methods.

Pro: SciPy does require references for adding new statistical distributions.
Con: Other libraries like NumPy or SciPy don't have this requirement explicitly stated.

Replies: 8 comments · 13 replies

Comment options

I would emphasize a point alluded to by the third and fourth pros: an archival reference provides some confidence that future maintainers and users can resolve questions about algorithms or their implementation. This has been necessary several times, e.g., #1239. Having a reference doesn't ensure technical merit, but at least it should provide clarity.

Keeping implementations faithful to their references provides clarity and addresses one of the deficiencies in PV performance modeling identified at the original 2010 PVPMC workshop: we all said we used the "Perez" model, but we all had different implementations whose results disagreed.

1 reply
@wholmgren
Comment options

I agree that an archival reference is important. Ideally that reference is available for free (or at least a nearly equivalent reference such as a preprint is available for free). I don't care if that archive is on Zenodo, IEEE proceedings, or a traditional journal.

Comment options

Thank you @mikofski for the excellent start to an important discussion!

I see two questions here:

  1. Is a publication necessary?
  2. Is a publication sufficient?

I have been thinking that we could establish a (hopefully) simple points system for proposed features to achieve a more balanced assessment of suitability for inclusion in pvlib as opposed to accept/reject on the basis of this one specific criterion.

Aspects to be rated could be (a rough sketch of how these might be scored follows the list):

  • documentation (availability and quality of information)
    • open access vs. paywall
    • included in PR ++
    • complete and sufficient to reproduce or recreate code
  • validation
    • level of peer review
    • availability of data and/or code to replicate results
    • evidence of widespread use
  • usefulness for
    • industry
    • research
    • learning
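
As a very rough illustration of how such a points system might work, here is a minimal Python sketch. The aspect names, criteria, and point values are purely hypothetical, not anything the project has agreed on:

```python
# Hypothetical scoring rubric: aspects, criteria, and point values are
# illustrative only, not an agreed pvlib standard.
RUBRIC = {
    "documentation": {"open_access": 2, "included_in_pr": 1, "reproducible": 2},
    "validation": {"peer_reviewed": 2, "data_or_code_available": 2, "widespread_use": 1},
    "usefulness": {"industry": 1, "research": 1, "learning": 1},
}


def score_feature(assessment):
    """Sum points for the criteria a proposed feature satisfies.

    ``assessment`` maps an aspect name to the set of criteria that are met,
    e.g. ``{"documentation": {"open_access"}, "validation": {"peer_reviewed"}}``.
    """
    return sum(
        points
        for aspect, criteria in RUBRIC.items()
        for criterion, points in criteria.items()
        if criterion in assessment.get(aspect, set())
    )


# Example: open-access documentation plus peer review scores 2 + 2 = 4.
print(score_feature({"documentation": {"open_access"},
                     "validation": {"peer_reviewed"}}))
```

A real rubric would of course need agreed weights and thresholds (e.g., a minimum total score for acceptance), which is part of the process question below.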

Of course there is a process question too: who evaluates and who decides?

Full disclosure: I would like to convince other maintainers to accept #1878 for inclusion in pvlib--it is currently blocked.

0 replies
Comment options

mikofski
Oct 28, 2023
Collaborator Author

More points:

Pro: The level of review required might be too great for pvlib. I prefer to push the burden onto others. Even if they do a poor job, I don't feel like it's the responsibility of pvlib to do rigorous peer reviews. TBH, if journals can't even meet this standard, how will we?
Con: pvlib will have to do some level of review regardless, and clearly we can't depend on journals and conferences to yield reproducible science.

Pro: I'm even more concerned about the possibility of perceived bias. How easy would it be for one of the core maintainers to promote their personal agenda if there's no independent standard?
Con: Already several of us are part of large institutions that could publish and archive references. Does that meet the standard? Could we be perceived as taking advantage of our positions to promote our work? Are we already biased?
2 replies
@cwhanse
Comment options

Con: Already several of us are part of large institutions that could publish and archive references. Does that meet the standard?

I don't see membership at an institution that publishes as a pro or a con. It's a tool that can provide an archival document when publication elsewhere isn't a viable option, usually due to length (short or long).

It's an interesting question whether referencing an institutional report creates the appearance of bias. Are these reports perceived as being of lower technical quality because of the publisher?

@adriesse
Comment options

I think it's pretty hard to generalize about quality and thresholds, hence my suggestion to look at individual aspects of each proposed contribution. I have found that the quality of reports from Sandia, NREL, Fraunhofer, and independent consulting organizations varies.

Archival access can be ensured for posterity with a high degree of certainty in ways that were not possible a decade or more ago. Zenodo is one way, and it is open to all, so if the documentation of a proposed feature lacks permanence, it can simply be pushed to Zenodo to solve that specific problem.

The only way I see bias creeping in is through assumptions about certain sources or platforms or publishers. Or more specifically, through unchecked assumptions.

Comment options

Good discussion!

I may be restating the obvious, but I've seen peer-reviewed publications that appeared to have had less thorough scrutiny/review than would happen as part of a pvlib PR.

I like @adriesse's list of aspects. And I like @cwhanse's point about archival references.

It seems to me that the requirements for something to be included might need to scale with the complexity and novelty of the contribution. Small modifications to existing approaches that have obvious benefits should be accepted with minimal hurdles, while a dramatically different approach might need more rigorous review/validation.

To @mikofski's point,

Could we be perceived as taking advantage of our positions to promote our work

That does seem like something to be careful of. On the other hand, there should be some benefits from being a contributor/maintainer. Maybe that isn't the right benefit...?

8 replies
@williamhobbs
Comment options

@wholmgren, I had very similar thoughts about "maintainers' institutions" and DOE funding the labs to directly or indirectly contribute to pvlib. I wasn't sure how to put it into words, and you did a good job.

On the other other hand, Josh's slides from the DOE SETO open-source workshop last year [1] do say that pvlib-python is "Community property, not of any institution."

But, DOE funding has been, and I think will continue to be, critical for pvlib. So, I'm also okay with a little bit of [whatever is the right word that is sort of like favoritism] towards institutions with DOE (and, maybe, similar?) funding.

[1] https://www.energy.gov/sites/default/files/2022-10/DOE%20OSS%20Workshop%2C%20Josh%20Stein%2C%20Sandia.pdf

@adriesse
Comment options

adriesse Nov 3, 2023
Collaborator

I think the statement that pvlib-python is "community property" is probably technically incorrect and was probably intended to be inspirational. Just for fun, here's how a recent SEJ paper that I happened to browse five minutes ago describes pvlib-python:

"𝙿𝚅𝙻𝚒𝚋 is a Python library, which contains many different routines geared towards the simulation of PV system performance, developed by the Sandia National Laboratories in the United States."

@wholmgren
Comment options

The statement is not inspirational - recall #1692 for relatively recent discussion of copyright.

This is not the first time that I've seen a statement like the one you've quoted from that paper. It's disappointing. And speaks to the concerns that Adam raised in the linked PR.

@adriesse
Comment options

adriesse Nov 4, 2023
Collaborator

The statement is not inspirational - recall #1692 for relatively recent discussion of copyright.

Yes, I was thinking about that discussion. The word "community" made me think of a much larger group that includes users, not just contributors.

@kandersolar
Comment options

This is not the first time that I've seen a statement like the one you've quoted from that paper. It's disappointing.

I agree. I think editing the project descriptions in the RTD landing page and PyPI would be a good first step towards correcting this perception.

Comment options

mikofski
Nov 2, 2023
Collaborator Author

@williamhobbs I propose a dedicated group that could have rotating membership and vary in size, but that probably should not be made up entirely of maintainers and could include non-contributing members who are either users or industry experts. Perhaps, in combination with a Zenodo archive, that group could judge the merit of new features based on yours and Anton's points above and make recommendations to the maintainers and community at large? Something like that?

2 replies
@williamhobbs
Comment options

I like this idea. Downside is that it could get tedious, and/or slow things down. It reminds me a bit of journal peer review or industry funding application review, which can both be slow processes.

@adriesse
Comment options

adriesse Nov 3, 2023
Collaborator

I think there is no way of getting around a group of evaluators of some sort. If there are criteria to be satisfied, someone has to evaluate them, and there will probably have to be some prescribed process as well. If the criteria and process are well-defined, though, I see no fundamental reason for it to be slow.

Comment options

Hello folks, this is a great discussion and I thought I'd pop back out of the woodwork with some thoughts here. I'm admittedly very out of the loop on a lot of the recent developments in the package, so apologies if any of this is already covered.

When I think about pvlib, I think that the core use-case for it is that it is useful. For most people most of the time, that probably means that core workflows in the package are based on peer-reviewed and archival references, because it is useful to be able to trust their outputs as a part of a larger project, and to know that "someone" has looked at them.

However, for other users it is likely useful to have workflows and tools that haven't yet been reviewed but that have been developed specifically because they are useful. I would also imagine there are a large number of applications that will stand on the border between something that would make sense to peer review, and something that just makes sense in the flow of the model or workflow being implemented.

I would also suspect, as has been pointed out in this thread, that implementing an algorithm in pvlib and running it through a wide range of conditions and test cases is probably a more rigorous test than what is required to pass peer review. I think there have been many good examples of the community finding and correcting issues with published models. An aspiration here might be that the pvlib implementation (with associated tests) is the reference implementation of a function (and the output that is incentivized by funding bodies), not the publication.

So, perhaps there are two (and maybe three) types of criteria for features:

  • Something that affects a core workflow: Core workflows as defined by this group, likely including model chain and other areas that are commonly used as part of larger workflows. Inclusion of updates to these workflows would require peer review, or possibly a flag/warning if not based on a peer-reviewed document (e.g., you could run a homebrew backtracking algorithm that is in the package, but the standard pvlib implementation would throw a warning; see the sketch at the end of this comment). At some point in the future, this could also require physical test case development.

  • Functions outside core workflows: This does not require peer review, and flagging inside documentation that the function is not published may be sufficient for users of different technical levels to interpret. (call it a Beta or Labs feature, if we want to sound like a billionaire)

  • Utility functions? If it is useful, it can be included without reference, as @williamhobbs has said

Possibly this could achieve different streams of review, where items that are likely to impact core workflows receive a higher level of scrutiny - and review through a group as @mikofski has described - but other functions are able to be implemented and "put through their paces" without having to hit this threshold.
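
To make the flag/warning idea above concrete, here is a minimal sketch of how it could work. Nothing like this exists in pvlib today; the decorator name, warning text, and example function are all made up:

```python
import functools
import warnings


def unreviewed(func):
    """Hypothetical marker for functions lacking an archival reference."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} implements a method without a peer-reviewed or "
            "archival reference; treat its results as provisional.",
            UserWarning,
        )
        return func(*args, **kwargs)
    return wrapper


@unreviewed
def homebrew_backtracking(tracker_theta):
    # Placeholder for a not-yet-published tracking algorithm.
    return tracker_theta
```

Users who have decided an unreviewed method is acceptable for their application could then silence the warning with the standard warnings filters.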

0 replies
Comment options

Any ideas how to wrap this discussion up?

0 replies
Comment options

I propose a policy for references (an illustrative docstring sketch follows the list):

  • The intent of requiring a reference is to ensure that the pvlib code implements the authors' algorithms as faithfully as possible, and that pvlib maintainers and others can resolve questions about pvlib code.
  • References are required for functions that implement a model (e.g., clear-sky irradiance, cell temperature, mismatch loss) or that retrieve or read external data.
  • References should be archival documents that explain the model or data that is being read.
  • References may include code but should not exclusively comprise code. The concern here is that the algorithm may not be understandable without executing old code, which may be a problem for future maintainers.
  • References may include URLs but should not exclusively comprise webpages. Exceptions may be made for popular pvlib content when no other reference material is available.
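
For illustration only, here is roughly what such a reference looks like in practice in a numpydoc-style docstring, the format pvlib already uses. The function, model, coefficients, and citation below are invented:

```python
def example_cell_temperature(poa_global, temp_air, wind_speed):
    """Estimate cell temperature using a hypothetical published model.

    Parameters
    ----------
    poa_global : numeric
        Plane-of-array irradiance. [W/m^2]
    temp_air : numeric
        Ambient air temperature. [C]
    wind_speed : numeric
        Wind speed. [m/s]

    Returns
    -------
    numeric
        Cell temperature. [C]

    References
    ----------
    .. [1] A. Author and B. Author, "A placeholder cell temperature model",
       Journal of Illustrative PV Research, 2023. (invented citation)
    """
    # Coefficients are placeholders, not taken from any real reference.
    return temp_air + 0.03 * poa_global * (1 - 0.05 * wind_speed)
```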
0 replies