Separating the two halves of a split violin plot to compare the tails

Question (votes: 0, answers: 2)

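Is there a way to physically separate the two halves of a "split" seaborn violin plot (or any other kind of violin plot)? I'm trying to compare two different treatments, but the tails are so thin that it is hard (impossible) to tell whether one or both halves of the split violin extend all the way to the tip of the tail.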

example violinplot

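One idea I had is that if the two halves were slightly apart instead of flush against each other, it would be much easier to read the data accurately.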

Here is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
import seaborn as sns

# load data into a dataframe
# note: current pandas uses sheet_name/usecols (formerly sheetname/parse_cols)
df1 = pd.read_excel('Modeling analysis charts.xlsx',
                   sheet_name='lmps',
                   usecols=[0,5],
                   skiprows=0,
                   header=1)

# identify which dispatch run this data is from      
df1['Run']='Scheduling' 

# load data into a dataframe
df2 = pd.read_excel('Modeling analysis charts.xlsx',
                   sheet_name='lmps',
                   usecols=[7,12],
                   skiprows=0,
                   header=1)

# identify which dispatch run this data is from
df2['Run']='Pricing' 

# drop rows with missing data
df1 = df1.dropna(how='any')
df2 = df2.dropna(how='any')

# merge data from different runs
df = pd.concat([df1,df2])

# LMPs are all opposite of actual values, so correct that
df['LMP'] = -df['LMP']

fontsize = 10

style.use('fivethirtyeight')

fig, axes = plt.subplots()

sns.violinplot(x='Scenario', y='LMP', hue='Run', split=True, data=df, inner=None, scale='area', bw=0.2, cut=0, linewidth=0.5, ax = axes)
axes.set_title('Day Ahead Market')

#axes.set_ylim([-15,90])
axes.yaxis.grid(True)
axes.set_xlabel('Scenario')
axes.set_ylabel('LMP ($/MWh)')

#plt.savefig('DAMarket.pdf', bbox_inches='tight')

plt.show()

Tags: matplotlib, seaborn, violin-plot
2 answers

4 votes

Edit #2: Newer versions of seaborn (>= 0.13.0) now support this natively.

Use the gap keyword argument, for example:

sns.violinplot(..., gap=0.1)

Hooray for mwaskom & Co.
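
As a rough illustration of the native option, the sketch below draws a split violin plot with a gap between the two halves. It assumes seaborn >= 0.13 is installed. The synthetic data and its column names only mimic the layout of the question's data; they are not the asker's spreadsheet.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Scenario": rng.choice(["A", "B"], size=n),
    "Run": rng.choice(["Scheduling", "Pricing"], size=n),
    "LMP": rng.normal(size=n),
})

fig, ax = plt.subplots()
# gap requires seaborn >= 0.13; it leaves space between the two halves
sns.violinplot(data=df, x="Scenario", y="LMP", hue="Run",
               split=True, gap=0.1, inner=None, cut=0, ax=ax)
# call plt.show() or fig.savefig(...) to view the result
```

A larger gap value pushes the halves further apart, which may make very thin tails easier to tell apart.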

Edit: For historical reasons this is the accepted answer, but have a look at @conchoecia's newer and cleaner implementation.

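Nice idea. The basic idea of my implementation is to draw the whole plot, grab the patches that correspond to the two half-violins, and then shift the paths of those patches to the left or to the right. The code is hopefully self-explanatory; otherwise let me know in the comments.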


import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections
import seaborn as sns
import pandas as pd

# create some data
n = 10000 # number of samples
c = 5 # classes
y = np.random.randn(n)
x = np.random.randint(0, c, size=n)
z = np.random.rand(n) > 0.5 # sub-class
data = pd.DataFrame(dict(x=x, y=y, z=z))

# initialise new axis;
# if there is random other crap on the axis (e.g. a previous plot),
# the hacky code below won't work
fig, ax = plt.subplots(1,1)

# plot
inner = None # Note: 'box' is default
ax = sns.violinplot(data=data, x='x', y='y', hue='z', split=True, inner=inner, ax=ax)

# offset stuff
delta = 0.02
for ii, item in enumerate(ax.collections):
    # axis contains PolyCollections and PathCollections
    if isinstance(item, matplotlib.collections.PolyCollection):
        # get path
        path, = item.get_paths()
        vertices = path.vertices

        # shift x-coordinates of path
        if not inner:
            if ii % 2: # -> to right
                vertices[:,0] += delta
            else: # -> to left
                vertices[:,0] -= delta
        else: # inner='box' adds another type of PolyCollection
            if ii % 3 == 0:
                vertices[:,0] -= delta
            elif ii % 3 == 1:
                vertices[:,0] += delta
            else: # ii % 3 == 2
                pass
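
The index-based logic above depends on the order of the collections on the axis. A possible alternative, sketched here for vertical violins under the assumption that each category is centered on an integer x position, decides the shift direction from the geometry of each polygon instead. The helper name shift_halves_by_geometry is hypothetical, and the demo uses plain matplotlib fill_betweenx shapes as stand-ins for the two half-violins.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PolyCollection

def shift_halves_by_geometry(ax, delta):
    """Move each half-violin away from its (assumed integer) center by delta."""
    for coll in ax.collections:
        if not isinstance(coll, PolyCollection):
            continue
        for path in coll.get_paths():
            xs = path.vertices[:, 0]
            mean_x = xs.mean()
            center = np.round(mean_x)          # assumed category position
            side = np.sign(mean_x - center)    # -1 for left half, +1 for right half
            path.vertices[:, 0] += side * delta

# Stand-in half-violins centered on x = 0 (not produced by seaborn)
y = np.linspace(-3, 3, 50)
dens = 0.4 * np.exp(-y**2 / 2)
fig, ax = plt.subplots()
left = ax.fill_betweenx(y, -dens, 0)
right = ax.fill_betweenx(y, 0, dens)

delta = 0.05
shift_halves_by_geometry(ax, delta)
```

Because the direction comes from the polygon itself, this sketch does not rely on the halves appearing in alternating order, but it would misclassify shapes that are wider than half the distance between categories.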

4 votes

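I extended @Paul's answer above and made it more robust. It now supports both vertical and horizontal orientations, and I implemented it to work with inner='sticks', since that suits my application.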


import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections
import seaborn as sns
import pandas as pd

def offset_violinplot_halves(ax, delta, width, inner, direction):
    """
    This function offsets the halves of a violinplot to compare tails
    or to plot something else in between them. This is specifically designed
    for violinplots by Seaborn that use the option `split=True`.

    For lines, this works on the assumption that Seaborn plots everything with
     integers as the center.

    Args:
     <ax>    The axis that contains the violinplots.
     <delta> The amount of space to put between the two halves of the violinplot
     <width> The total width of the violinplot, as passed to sns.violinplot()
     <inner> The type of inner in the seaborn
     <direction> Orientation of violinplot. 'horizontal' or 'vertical'.

    Returns:
     - NA, modifies the <ax> directly
    """
    # offset stuff
    if inner == 'sticks':
        lines = ax.get_lines()
        for line in lines:
            if direction == 'horizontal':
                data = line.get_ydata()
                if int(data[0] + 1)/int(data[1] + 1) < 1:
                    # type is top, move neg, direction backwards for horizontal
                    data -= delta
                else:
                    # type is bottom, move pos, direction backwards for horizontal
                    data += delta
                line.set_ydata(data)
            elif direction == 'vertical':
                data = line.get_xdata()
                if int(data[0] + 1)/int(data[1] + 1) < 1:
                    # type is left, move neg
                    data -= delta
                else:
                    # type is right, move pos
                    data += delta
                line.set_xdata(data)


    for ii, item in enumerate(ax.collections):
        # axis contains PolyCollections and PathCollections
        if isinstance(item, matplotlib.collections.PolyCollection):
            # get path
            path, = item.get_paths()
            vertices = path.vertices
            half_type = _wedge_dir(vertices, direction)
            # shift the path's coordinates away from the center
            if half_type in ['top','bottom']:
                if inner in ["sticks", None]:
                    if half_type == 'top': # -> up
                        vertices[:,1] -= delta
                    elif half_type == 'bottom': # -> down
                        vertices[:,1] += delta
            elif half_type in ['left', 'right']:
                if inner in ["sticks", None]:
                    if half_type == 'left': # -> left
                        vertices[:,0] -= delta
                    elif half_type == 'right': # -> right
                        vertices[:,0] += delta

def _wedge_dir(vertices, direction):
    """
    Args:
      <vertices>  The vertices from matplotlib.collections.PolyCollection
      <direction> Direction must be 'horizontal' or 'vertical' according to how
                   your plot is laid out.
    Returns:
      - a string in ['top', 'bottom', 'left', 'right'] that determines where the
         half of the violinplot is relative to the center.
    """
    if direction == 'horizontal':
        result = (direction, len(set(vertices[1:5,1])) == 1)
    elif direction == 'vertical':
        result = (direction, len(set(vertices[-3:-1,0])) == 1)
    outcome_key = {('horizontal', True): 'bottom',
                   ('horizontal', False): 'top',
                   ('vertical', True): 'left',
                   ('vertical', False): 'right'}
    # if the first couple x/y values after the start are the same, it
    #  is the input direction. If not, it is the opposite
    return outcome_key[result]

# create some data
n = 100 # number of samples
c = ['cats', 'rats', 'bears', 'pears', 'snares'] # classes
y = np.random.randn(n)
x = np.random.choice(c, size=n)
z = np.random.rand(n) > 0.5 # sub-class
data = pd.DataFrame(dict(x=x, y=y, z=z))
print('done making data')

# initialise new axes;
fig, (ax1, ax2) = plt.subplots(2)

inner = "sticks" # Note: 'box' is default
width = 0.75
delta = 0.05
final_width = width - delta
print(data)
sns.violinplot(data=data, x='y', y='x',
               split=True, hue = 'z',
               ax = ax1, inner='sticks',
               bw = 0.2)
sns.violinplot(data=data, x='x', y='y',
               split=True, hue = 'z',
               ax = ax2, inner='sticks',
               bw = 0.2)

offset_violinplot_halves(ax1, delta, final_width, inner, 'horizontal')
offset_violinplot_halves(ax2, delta, final_width, inner, 'vertical')

plt.show()