Classifying data by name in Python

I have this as data (lowercase), and I import it into my script.

LastName StartTime EndTime Duration Period TeamAbbrev Position 
Bouwmeester  0:00 0:37 0:37   1   STL   D 
Schwartz  0:00 0:40 0:40   1   STL   W 
Foligno   0:00 0:40 0:40   1   MIN   W 
Pietrangelo  0:00 0:48 0:48   1   STL   D 
Suter   0:00 0:40 0:40   1   MIN   D 
Staal   0:00 0:40 0:40   1   MIN   C 
Niederreiter 0:00 0:40 0:40   1   MIN   W 
Allen   0:00 20:00 20:00  1   STL   G 
Steen   0:00 0:30 0:30   1   STL   W 
Tarasenko  0:30 1:27 0:57   1   STL   W 
Parayko   0:37 1:43 1:06   1   STL   D

This is the script:

import csv 
from itertools import combinations, product 

#Header = LastName StartTime EndTime Duration Period TeamAbbrev Position 

#Import Game 
with open('2017020397.csv', newline='') as f: 
    next(f) 
    skaters = '\n'.join(' '.join(row) for row in csv.reader(f)) 
    data = skaters.splitlines() 

def to_secs(ms): 
    ''' Convert a mm:ss string to seconds ''' 
    m, s = map(int, ms.split(':')) 
    return 60 * m + s 

# Store a list of (start, end) times for each player 
players = {} 
for row in data: 
    name, start, end = row.split(None, 3)[:3] 
    times = to_secs(start), to_secs(end) 
    players.setdefault(name, []).append(times) 

for t in players.items(): 
    print(t) 
print() 

# Determine the amount of overlapping time for each combination of players
for p1, p2, p3 in combinations(sorted(players), 3):
    total = 0
    # Check each combination of shifts for these three players
    for t1, t2, t3 in product(players[p1], players[p2], players[p3]):
        # Compute the overlap of these three shifts and
        # add it to the total for these three players
        start, end = zip(t1, t2, t3)
        total += max(0, min(end) - max(start))
    if total:
        print(p1, p2, p3, total)

It outputs:

Allen Niederreiter Pietrangelo 5481 
Allen Niederreiter Prosser 2088 
Allen Niederreiter Reilly 1464

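The purpose of this is to see which teammates play with each other. From the output you can see that Allen, from STL, is matched with Niederreiter, who is from MIN. I only want combinations of players on the same team. TeamAbbrev is how a team is identified. One more stipulation: the TeamAbbrev values change from game to game, depending on which teams are playing that night. I'm open to any and all suggestions, thank you!

Edit: if an int() is easier to work with than the str() in TeamAbbrev, I can scrape the numeric teamId instead.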

Source

2017-12-09 Michael T Johnson

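So basically you just want to print **multiple different players, by name, per team**? Per team meaning one team at a time. – IMCoins

I was thinking the 'Output' would stay the same, except that the script would not pair opposing players together, as in the 'Allen' and 'Niederreiter' example. –

After players.setdefault, you could call players.setdefault once more, this time appending index [4] of row.split() to the same list, so that you can compare each player's teamAbbrev. On the line start, end = zip(t1, t2, t3) you can then write start, end, teamAbbrev = zip(t1, t2, t3), which gives you access to all the team abbreviations at that point. Does that help you enough? Then change if total: to add one more condition that checks the teamAbbrev values match. – IMCoins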

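Inside your for row in data: loop, add...

team_of[name] = row.split()[5]  # TeamAbbrev is index 5 (index 4 is Period); create team_of = {} before the loop

Then, inside the combinations loop:

teams = [team_of[p1], team_of[p2], team_of[p3]]
# if the number of occurrences of the first item (which is a team) is equal to the length of the list of teams, then all the players are from the same team.
if teams.count(teams[0]) == len(teams):
    # same lines as before, indented one more level because of the `if` condition.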
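Put together, the idea might look like the sketch below. It assumes the seven whitespace-separated columns from the question, so TeamAbbrev sits at index 5. The names team_of and same_team_overlaps are made up for illustration, the rows are a shortened copy of the sample data, and, like the original script, it ignores the Period column and assumes last names are unique within a game.

```python
from itertools import combinations, product

# Shortened sample rows in the question's whitespace-separated layout.
data = [
    "Bouwmeester 0:00 0:37 0:37 1 STL D",
    "Schwartz 0:00 0:40 0:40 1 STL W",
    "Foligno 0:00 0:40 0:40 1 MIN W",
    "Pietrangelo 0:00 0:48 0:48 1 STL D",
    "Suter 0:00 0:40 0:40 1 MIN D",
    "Staal 0:00 0:40 0:40 1 MIN C",
]

def to_secs(ms):
    """Convert a mm:ss string to seconds."""
    m, s = map(int, ms.split(":"))
    return 60 * m + s

def same_team_overlaps(rows, size=3):
    """Return {(p1, ..., pN): seconds} for teammates only (hypothetical helper)."""
    players = {}   # name -> list of (start, end) in seconds
    team_of = {}   # name -> TeamAbbrev
    for row in rows:
        fields = row.split()
        name, start, end = fields[:3]
        players.setdefault(name, []).append((to_secs(start), to_secs(end)))
        team_of[name] = fields[5]  # assumption: TeamAbbrev is the sixth column

    result = {}
    for combo in combinations(sorted(players), size):
        teams = [team_of[p] for p in combo]
        # Skip the combination unless the first team fills the whole list.
        if teams.count(teams[0]) != len(teams):
            continue
        total = 0
        for shifts in product(*(players[p] for p in combo)):
            starts, ends = zip(*shifts)
            total += max(0, min(ends) - max(starts))
        if total:
            result[combo] = total
    return result

overlaps = same_team_overlaps(data)
for combo, total in overlaps.items():
    print(*combo, total)
```

With these six rows only two trios survive the filter, one per team; mixed pairings such as Allen with Niederreiter would be skipped before any overlap is computed.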

Source

2017-12-09 04:32:25 IMCoins

'teams = row.split(None, 3)[4] IndexError: list index out of range' –

Yeah, sorry. I'll update. It's 6 AM here. << – IMCoins

No worries, thanks for the help! –

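Your question isn't easy to answer, but I'll try. I made some assumptions:

I assumed that an overlap can only happen when the players are on the ice in the same period.
I assumed that a player can play more than once in a period.
I assumed that the period is a single digit.
If a time recorded in the file is not in the format of one or two digits, a colon, and one or two digits, a value of 0 is used instead.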
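To make the first two assumptions concrete, here is a small sketch of how the shared ice time of several shifts within one period can be computed. The helper name shared_seconds is hypothetical; the shift times come from the sample file, and all of them are in period 1, so their clocks are comparable.

```python
def convert_to_seconds(time_string):
    """Convert a mm:ss string to seconds."""
    minutes, seconds = map(int, time_string.split(":"))
    return 60 * minutes + seconds

def shared_seconds(*shifts):
    """Seconds during which every (start, end) shift is on the ice at once."""
    starts, ends = zip(*shifts)
    # The common interval runs from the latest start to the earliest end;
    # a negative length means the shifts never all overlap.
    return max(0, min(ends) - max(starts))

# Shifts taken from the sample file, all in period 1.
allen = (convert_to_seconds("0:00"), convert_to_seconds("20:00"))
steen = (convert_to_seconds("0:00"), convert_to_seconds("0:30"))
tarasenko = (convert_to_seconds("0:30"), convert_to_seconds("1:27"))
parayko = (convert_to_seconds("0:37"), convert_to_seconds("1:43"))

print(shared_seconds(allen, tarasenko, parayko))  # 50
print(shared_seconds(allen, steen, tarasenko))    # 0
```

Because a player may have several shifts in a period, the per-shift results have to be summed over every combination of shifts, which is what itertools.product is used for in the script.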

Now, the two files:

2017020397.csv

LastName,StartTime,EndTime,Duration,Period,TeamAbbrev,Position 
Bouwmeester,0:00,0:37,0:37,1,STL,D 
Schwartz,0:00,0:40,0:40,1,STL,W 
Foligno,0:00,0:40,0:40,1,MIN,W 
Pietrangelo,0:00,0:48,0:48,1,STL,D 
Suter,0:00,0:40,0:40,1,MIN,D 
Staal,0:00,0:40,0:40,1,MIN,C 
Niederreiter,0:00,0:40,0:40,1,MIN,W 
Allen,0:00,20:00,20:00,1,STL,G 
Steen,0:00,0:30,0:30,1,STL,W 
Tarasenko,0:30,1:27,0:57,1,STL,W 
Parayko,0:37,1:43,1:06,1,STL,D

solution.py

import csv 
import re 
import itertools 

pattern_time = r"(\d{1,2}):(\d{1,2})" 
time_tester = re.compile(pattern_time) 

def convert_to_seconds(time_string):
    ''' Convert a mm:ss string to seconds '''
    pattern_found = time_tester.match(time_string)
    if pattern_found:
        time_string_separated = pattern_found.group(1, 2)
        minutes, seconds = map(int, time_string_separated)
        return 60 * minutes + seconds
    else:
        # We have a problem
        return 0

file_name = '2017020397.csv' 
teams = {} 
number_of_players_to_compare = 3 

with open(file_name, newline='') as source_file:
    csv_file = csv.DictReader(source_file)
    for row in csv_file:
        if row['TeamAbbrev'] not in teams:
            teams[row['TeamAbbrev']] = {}

        current_team = teams[row['TeamAbbrev']]
        if row['Period'] not in current_team:
            current_team[row['Period']] = {}

        current_team_period = current_team[row['Period']]
        if row['LastName'] not in current_team_period:
            current_team_period[row['LastName']] = []

        current_skater = current_team_period[row['LastName']]
        times_recorded = {'StartTime': convert_to_seconds(row['StartTime']),
                          'EndTime': convert_to_seconds(row['EndTime'])}
        current_skater.append(times_recorded)

for (current_team_to_show, current_periods) in teams.items():
    current_periods_sorted = sorted(current_periods)
    for current_period_name in current_periods_sorted:
        print("\nFor team", current_team_to_show, "in period", current_period_name, ":")

        current_period = current_periods[current_period_name]
        current_players = sorted(current_period)
        for current_player_combination in itertools.combinations(current_players, number_of_players_to_compare):
            total = 0
            for times_this_combination in itertools.product(*(current_period[x] for x in current_player_combination)):
                start_times = (x['StartTime'] for x in times_this_combination)
                end_times = (x['EndTime'] for x in times_this_combination)
                total += max(0, min(end_times) - max(start_times))

            print(" ".join(current_player_combination), total)

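Here are some comments on how I did it:

I used DictReader so that I don't have to skip the first line and can get each part of a row by its column name.
I used a nested dictionary data structure to hold the data. It is a dictionary of teams, each containing a dictionary of periods, each containing a dictionary of players, each containing a list of dictionaries of recorded times.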
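As a variation on the same structure, the explicit "not in" checks could be replaced by nested collections.defaultdict objects. This is a swapped-in technique, not what solution.py does; the sketch below reads a few sample rows from an in-memory string instead of the real file and keeps the times as strings for brevity.

```python
import csv
import io
from collections import defaultdict

# In-memory stand-in for a few lines of 2017020397.csv.
sample = io.StringIO(
    "LastName,StartTime,EndTime,Duration,Period,TeamAbbrev,Position\n"
    "Allen,0:00,20:00,20:00,1,STL,G\n"
    "Steen,0:00,0:30,0:30,1,STL,W\n"
    "Suter,0:00,0:40,0:40,1,MIN,D\n"
)

# team -> period -> player -> list of {'StartTime': ..., 'EndTime': ...}
teams = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

for row in csv.DictReader(sample):
    # DictReader consumed the header line, so columns are addressed by name.
    shift = {"StartTime": row["StartTime"], "EndTime": row["EndTime"]}
    teams[row["TeamAbbrev"]][row["Period"]][row["LastName"]].append(shift)

print(sorted(teams))              # ['MIN', 'STL']
print(sorted(teams["STL"]["1"]))  # ['Allen', 'Steen']
print(teams["MIN"]["1"]["Suter"])
```

One trade-off to keep in mind: merely looking up a missing key in a defaultdict creates it, so the plain dictionaries in solution.py are the safer choice if the structure is also queried with names that may not exist.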

If you have any questions, feel free to ask.

Source

2017-12-18 02:28:38 EvensF
